SYMPOSIUM: EXAMINATIONS


O-LEVEL GRADES AND TEACHERS’ ESTIMATES AS 
PREDICTORS OF THE A-LEVEL RESULTS OF UCCA 
APPLICANTS 


By R. J. L. MURPHY*
(The Associated Examining Board Research Unit, Aldershot) 


Summary. The investigation reported in this paper studied the relationship between 
both GCE O-level grades and teachers’ estimates of A-level grades, and the actual 
A-level grades obtained, in individual subjects, by a sample of applicants for university 
places. Moderate levels of correlation were reported in both cases, although the 
teachers' estimates appeared to be slightly better predictors of the A-level grades. 
There also appeared, in both cases, to be some important differences between the levels 
of correlation reported in individual subjects. In addition it was observed that the 
teachers, on average, tended to over-predict the A-level grades. It was noticeable that 
both the O-level grades and the teachers’ estimates of the A-level grades were bunched 
together in a very narrow range at the top of the grading scale. The effect of this 
restriction amongst the predictor variables is discussed in terms of its influence on the 
levels of correlations reported. The application of certain corrections that are intended 
to compensate for this effect is also considered. 


INTRODUCTION


THE results of public examinations are frequently used as predictive measures. There 
is not much information available, however, about how effectively they do predict 
future levels of performance. One type of prediction that has been well investigated 
is the use of secondary school leaving examinations as predictors of performance in 
higher education. Choppin (1972) and others have shown that there is not a very 
strong relationship between GCE A-level grades and university degree classification. 
In Scotland, Powell (1973), having demonstrated the poor predictive power of several 
measures of ability and past attainment, commented on the many other psychological 
and social factors that can influence performance in higher education. It might 
be thought that the relationship between the two main General Certificate of Education 
(GCE) examinations would be stronger, because both the Ordinary (O) and Advanced 
(A) level examinations are normally taken within the years of secondary schooling;
O-levels are taken predominantly by 16-year-olds and A-levels by 18-year-olds,
and there is often a close connection between the syllabuses and the schemes of
assessment for both examinations.


Until recently, information on the relationship between O- and A-level grades has 
been virtually non-existent. The problems of collecting suitable data to study this 
relationship have been discussed elsewhere by Shoesmith (1970) and Massey (1978). 
In essence, the problem is that the examination boards do not
store information about individual examination candidates in such a way that the
results of candidates at different levels in different years can be compared. Shoesmith
drew his samples manually from candidates who remained within the same schools
and entered for the same board at both levels, and in doing so recognised the
unrepresentativeness of his sample. Massey was able to draw on more representative
data by restricting his study to the O- and A-level Nuffield Physics examinations, 
where a central record is kept of all of the candidates entering for these examinations 
through all of the GCE boards. Massey reported a correlation of 0.73 between the
marks obtained by candidates sitting the O-level examination in that subject in 1973
and then going on to do the A-level in 1975. In addition, he suggested two aspects
of the data which had probably deflated the value of this correlation: the restricted
range of marks obtained by those candidates who were selected to go on to do A-level
and the unreliability of the A-level marks. Using methods for correcting correlation
coefficients suggested by Guilford and Fruchter (1973) and a somewhat optimistic
estimate, derived from Nuttall and Willmott (1972), of 0.96 as the estimated reliability
of the A-level marks, he obtained a corrected correlation coefficient of 0.84, which he
concluded showed that it was ". . . likely that O-level performance accounts for
approximately 70 per cent of the variance in A-level performance" in that subject.
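
The step from a correlation of 0.84 to "approximately 70 per cent of the variance" rests on squaring the correlation coefficient. A minimal sketch of that arithmetic follows; the function name `variance_explained` is illustrative only and does not come from Massey's paper.

```python
def variance_explained(r: float) -> float:
    """Proportion of variance in one variable accounted for by a linear
    relationship with another, given their correlation r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return r * r


# Correlations quoted in the text for Nuffield Physics (Massey, 1978).
for label, r in (("uncorrected", 0.73), ("corrected", 0.84)):
    share = variance_explained(r)
    print(f"{label}: r = {r:.2f}, variance explained = {share:.1%}")
```

On this reading the uncorrected figure of 0.73 corresponds to a little over half of the variance, which is why the corrections matter to the interpretation.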


An alternative method of obtaining data concerning the O-level and A-level 
grades of the same candidates is to obtain it directly from their schools. This approach 
has been followed by Miles (1979) as part of a study into a variety of factors that 
affect attainment at 18+. Miles studied 6,229 A-level candidates from a selected
group of schools and found O-level grades in the same subject to be by far the best
single predictor of A-level performance. He reported correlations between O-level
and A-level grades of between 0.38 and 0.73, depending on the subject and on which
boards' results were included in the analysis. He did not attempt to correct these
correlations in the way that Massey did, but he did note that higher correlations
would have been expected if the A-level candidates had been ". . . a less rigorously
selected group". It is noteworthy that Miles's highest level of correlation of 0.73
for 600 Joint Matriculation Board (JMB) O- and A-level French candidates equals
Massey's uncorrected figure for Nuffield Physics. Miles concludes, however, that
even in this optimum case the O-level grades were only able to account for 42.6 per
cent of the variance in the candidates’ A-level grades that he was not able to explain
by other factors such as sex, social class and school type. He also found that he was
unable to improve on the predictive power of O-level in the same subject by
considering other cognate O-level subjects or by taking the overall mean performance at
O-level as the predictor.


It would be unrealistic to expect the relationship between O-level grades and 
A-level grades to be more than moderately high, because of all of the different factors 
that can play a part in determining each of them. The approximate nature of all 
examination grades is bound to keep the relationship between grades obtained on 
different occasions down below a certain level of correlation, even if the candidates 
are being examined on the same syllabus on both occasions. Where different
examinations are being set on separate syllabuses with a time interval of two years between
them, it is remarkable that correlations as high as 0.73 have been obtained between
the results. One of the aims of the current investigation was to explore whether this 
relatively high level of correlation, which has been reported elsewhere, could be found 
between other similar examination results for a range of different subjects examined 
by the various GCE examining boards. 


An additional aspect of the present study involved looking at teachers’ estimates 
of A-level grades, to see whether or not these are better predictors of A-level results 
than are O-level grades. Previous work on teachers’ estimates of GCE grades has 
indicated that a moderately high level of correlation can be obtained by some teachers 
in some subjects. Another aspect of these studies has been the observation that 
teachers tend to over-predict grades rather than under-predict them (Petch, 1964; 
Ferguson and Garrett, 1977; Murphy, 1979). The records that were studied in the 
course of this investigation allowed an overall impression to be gained of the levels of 
correlation between predicted and obtained grades for a number of A-level subjects 
examined by the various GCE boards. These correlations will allow both overall and 
detailed comparisons to be made between the relative power of O-level grades and 
teachers’ estimates as predictors of A-level results. 


Sample

All of the data for the current investigation were drawn from the records of
university applicants who were applying through the Universities Central Council on
Admissions (UCCA) scheme for university entry in 1978. UCCA handled approximately
134,500 home applications in that year, and the analyses to be reported
were conducted on a 10 per cent sample that is extracted every year for research
purposes. This sample includes all those applicants who were born on the 5th, 15th
or 25th of each month.

At the time that the investigation was conducted, each application form included 
a full record of each applicant's O-level and A-level examination grades (where the
applicant had, in fact, entered for these examinations), and a large number included 
details of teachers' pre-examination estimates of A-level grades. 


In the light of the discussion in the introduction to this paper, it can be seen 
that the UCCA records provide a unique source of readily available information
concerning the relationship both between O-level grades and A-level grades and 
between predicted A-level grades and actual A-level grades. Clearly, the fact that 
these candidates were all applying for places at university suggests that they were not 
necessarily a representative sample of all candidates who enter for both O-level and 
A-level GCE examinations; we will return to discuss the implications of any such 
sample bias at a later stage in this paper. 


RESULTS 


Table 1 shows a board by board analysis of correlations between O-level grades 
and A-level grades, for those candidates who did both of these examinations in the 
same subject. This analysis is presented for the nine subjects that had the largest 
number of candidates in the sample qualifying for inclusion in the table. 


The results presented in Table 1 do not reveal any consistent differences in the 
levels of correlation reported for each of the boards; the correlations reported for 
individual subjects do vary somewhat from one board to another, but, given that the 
size of the groups is often quite small, this in itself is not all that remarkable. The 
one aspect of the grouped data that may be of some significance is the fact that the 
correlations for those candidates who did their O-levels and A-levels with two different 
examination boards are generally lower than the correlations for those candidates who 
remained with the same board. This result might have been predicted on the basis 
of syllabus differences, or in terms of differences in the standards between the boards. 


The most striking aspect of the data reported in Table 1 is the variation in the 
overall levels of correlation reported for each of the nine subjects. The most marked 
difference of this type was between Physics, Chemistry and French, which revealed 
relatively high correlations, and Geography, History and the two English subjects, 
which revealed much lower correlations. In this context it is noticeable that Physics 
and French were the subject areas where relatively high correlations had been reported 
previously by Massey (1978) and Miles (1979). It is, however, also noticeable that 
if the correlations from very small groups of candidates are disregarded, the highest 
overall correlations were somewhat below the highest correlations reported by Miles 
and Massey. Apart from the small groups the highest level of correlation achieved 
was 0.70, between the Board E O-level and A-level French grades. It is possible,
however, that the lower levels of correlation reported in this study were partly due to 
the selection processes that decide which candidates apply for places at university. 
It will be seen, in a later section of the paper, that there were only a very few applicants 
who had received O-level grades lower than a C. This restriction in the variance of 
the O-level grades will be shown to have quite a considerable effect in deflating the 
values of the reported correlations. 

Table 2 provides similar data for the nine subjects with the greatest number of
predicted and actual A-level grade comparisons. It is noticeable in this case that the 
correlations are higher than they were in Table 1, and this indicates that the teachers' 
estimates of the A-level grades were better predictors of the A-level grades than the 
O-level grades were. 


A further aspect of the data reported in Table 2 is that once again the highest
correlations are achieved in relation to Physics, Chemistry and French grades, and 
the lowest correlations are reported in relation to Geography, History and English 
grades. In some ways this is a quite surprising finding and it does tend to suggest that 
A-level grades in some subjects are harder to predict, however one tries to predict 
them, than are the grades in other A-level subjects. It is also noticeable that, in both 
Tables 1 and 2, the highest correlations are reported for those subjects that have the 
reputation of being examined most reliably and the lowest correlations are reported 
for subjects which are thought to be more difficult to examine with a high degree of 
precision. This may suggest that the measurement reliability of the O-level and A-level 
grades has a substantial effect on the levels of correlation reported in each of the 
subjects. 

It is always dangerous to discuss correlation coefficients without being aware of 
the characteristics of the distributions of scores on which they are based. Table 3 
illustrates the sort of data on which the correlations reported in Table 1 were based, 
by providing a detailed analysis of the grades obtained by those candidates who did 
English Literature O-level and English A-level. 


Table 3 illustrates the fact that the great majority of the university applicants
who attempted A-level in a subject that they had already done at O-level had obtained
a grade C or better at O-level. This is often a requirement for entry to A-level courses 
and as such is not all that surprising. It is noticeable, however, that the small number 
of students who did go on to do A-level after obtaining grades D, E and U at O-level 
tended to obtain fairly good results at A-level. This finding needs to be interpreted 
with extreme caution, because there are undoubtedly very strong selection factors 
involved in deciding which students with low O-level grades are allowed on to A-level 
courses. A similar finding was reported by the Scottish Certificate of Education
Examination Board (SCEEB) in a study where they compared O-grade results with 
H-grade results, and found that students with low O-grade marks who went on to do 
H-grades had a better chance of obtaining good grades than students with somewhat 
better O-grade marks (Dunning Report, 1977). 


The very small number of students with low O-level grades who were included in
the sample confirms the point mentioned earlier, that the correlations reported in
Table 1 will have been considerably deflated by the restricted range of O-level grades.
It is also clear, however, that there is only a very approximate relationship between
the O-level grades A, B and C and the complete range of A-level grades. Candidates
with grade A at O-level are almost as likely to obtain grade C or below at A-level as
they are to obtain grades A or B.


TABLE 3


A COMPARISON OF THE O-LEVEL ENGLISH LITERATURE GRADES AND A-LEVEL
ENGLISH GRADES OBTAINED BY 3,202 UCCA APPLICANTS

                                 A-level Grades
O-level Grades      A     B     C     D     E     O     F   Total
      A           284   316   160   115    83    31    14    1003
      B           166   334   220   232   194   125    30    1301
      C            39   149   121   144   150   132    51     786
      D             2     7     8     7    11     9     5      49
      E             1     8     4     8    13     9     9      52
      U             2     1     1     4     1     2     0      11
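
To make concrete how a correlation can be obtained from a grade cross-tabulation such as Table 3, the sketch below applies the numerical scales given in the paper's footnote (O-level A to U as 1 to 6, A-level A to F as 1 to 7) and computes a product-moment correlation directly from the cell counts. The function name and data layout are illustrative assumptions. The paper does not say exactly how its Table 1 figures were computed, so the result need not reproduce them.

```python
import math

# Numerical scales from the paper's footnote (1 is the best grade).
O_LEVEL_SCALE = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "U": 6}
A_LEVEL_SCALE = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "O": 6, "F": 7}

# Table 3 counts: rows are O-level grades, columns are A-level grades A..F.
TABLE_3 = {
    "A": [284, 316, 160, 115, 83, 31, 14],
    "B": [166, 334, 220, 232, 194, 125, 30],
    "C": [39, 149, 121, 144, 150, 132, 51],
    "D": [2, 7, 8, 7, 11, 9, 5],
    "E": [1, 8, 4, 8, 13, 9, 9],
    "U": [2, 1, 1, 4, 1, 2, 0],
}


def pearson_from_table(table, row_scale, col_scale):
    """Product-moment correlation for a cross-tabulation of two graded
    variables, weighting each (row, column) pair by its cell count."""
    col_values = list(col_scale.values())
    n = sum_x = sum_y = sum_xx = sum_yy = sum_xy = 0
    for row_grade, counts in table.items():
        x = row_scale[row_grade]
        for y, count in zip(col_values, counts):
            n += count
            sum_x += count * x
            sum_y += count * y
            sum_xx += count * x * x
            sum_yy += count * y * y
            sum_xy += count * x * y
    mean_x, mean_y = sum_x / n, sum_y / n
    cov = sum_xy / n - mean_x * mean_y
    var_x = sum_xx / n - mean_x ** 2
    var_y = sum_yy / n - mean_y ** 2
    return cov / math.sqrt(var_x * var_y)


r = pearson_from_table(TABLE_3, O_LEVEL_SCALE, A_LEVEL_SCALE)
print(f"O-level English Literature vs A-level English: r = {r:.2f}")
```

Because nearly all of the candidates fall in the first three rows, the O-level variable has little spread, which is the restriction of range the text describes.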


Table 4 provides a further detailed comparison, in this case for the teachers' 
predicted A-level grades and the actual grades obtained by the A-level English 
candidates. 


TABLE 4


A COMPARISON OF PREDICTED GRADES WITH ACTUAL GRADES OBTAINED BY 2,004
A-LEVEL ENGLISH UCCA APPLICANTS

                              Actual Examination Grades
Predicted Grades    A     B     C     D     E     O     F   Total
      A           101    44    14     9     6     4     -     178
      A/B          55    68    29    15    14     4     -     185
      B            85   128    83    64    39    12     4     415
      B/C          38    83    65    49    30    16     5     286
      C            28    93    72    89    74    55    12     423
      C/D           8    30    27    44    37    27     3     176
      D             5    20    30    39    45    29     9     177
      D/E           5     6     8    16    16    20     6      77
      E             1     3     3    10    22    20    17      76
      E/O           -     -     -     -     -     -     2       2
      O             -     -     -     1     1     4     1       7
      O/F           -     -     -     -     -     -     -       0
      F             -     -     -     -     1     -     1       2
      Total       326   475   331   336   285   191    60    2004


One feature of the teachers' estimates that is revealed by Table 4 is that by using 
the intermediate grades (for example, A/B and B/C) they effectively stretch the range 
of points on the predicted grading scale. This effect is counteracted, however, by a 
tendency for the great majority of predictions to be made at the level of grade C or 
above. 


The results also illustrate quite large discrepancies between the predicted grades 
and the actual grades obtained, and there is evidence of a definite tendency, on the 
part of the teachers, to over-predict the A-level grades rather than under-predict 
them. This finding, which was confirmed in the results for the other subjects as well, 
supports the previous findings of Ferguson and Garrett (1977) and Murphy (1979), 
and has obvious implications for those who make use of teachers' estimates as opposed 
to actual A-level results. In this study, the teachers tended to over-predict the A-level 
grades by an average of approximately one grade per applicant. 
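
The direction and size of over-prediction can be illustrated from the Table 4 counts. The sketch below is built on one loud assumption that is not stated in the paper: intermediate predictions such as A/B are coded as half-points (A/B = 1.5, B/C = 2.5, and so on) on the footnote's A-level scale. The figure for this single subject under that assumed coding need not equal the study-wide average of about one grade quoted in the text.

```python
# Assumed coding (not from the paper): intermediate predictions sit halfway
# between the neighbouring grades on the footnote's 1..7 A-level scale.
PREDICTED_SCALE = {
    "A": 1.0, "A/B": 1.5, "B": 2.0, "B/C": 2.5, "C": 3.0, "C/D": 3.5,
    "D": 4.0, "D/E": 4.5, "E": 5.0, "E/O": 5.5, "O": 6.0, "O/F": 6.5, "F": 7.0,
}
ACTUAL_VALUES = [1, 2, 3, 4, 5, 6, 7]  # actual grades A, B, C, D, E, O, F

# Table 4 counts: rows are predicted grades, columns are actual grades A..F.
TABLE_4 = {
    "A":   [101, 44, 14, 9, 6, 4, 0],
    "A/B": [55, 68, 29, 15, 14, 4, 0],
    "B":   [85, 128, 83, 64, 39, 12, 4],
    "B/C": [38, 83, 65, 49, 30, 16, 5],
    "C":   [28, 93, 72, 89, 74, 55, 12],
    "C/D": [8, 30, 27, 44, 37, 27, 3],
    "D":   [5, 20, 30, 39, 45, 29, 9],
    "D/E": [5, 6, 8, 16, 16, 20, 6],
    "E":   [1, 3, 3, 10, 22, 20, 17],
    "E/O": [0, 0, 0, 0, 0, 0, 2],
    "O":   [0, 0, 0, 1, 1, 4, 1],
    "O/F": [0, 0, 0, 0, 0, 0, 0],
    "F":   [0, 0, 0, 0, 1, 0, 1],
}


def mean_over_prediction(table):
    """Mean of (actual - predicted) in grade points. Because 1 is the best
    grade, a positive value means results were worse than predicted."""
    n = 0
    total_gap = 0.0
    for predicted_grade, counts in table.items():
        predicted = PREDICTED_SCALE[predicted_grade]
        for actual, count in zip(ACTUAL_VALUES, counts):
            n += count
            total_gap += count * (actual - predicted)
    return total_gap / n


gap = mean_over_prediction(TABLE_4)
print(f"Mean over-prediction for A-level English: {gap:.2f} grade points")
```

A positive result here reflects the tendency the text describes: predictions cluster at grade C or above, while the actual grades spread across the full scale.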


DISCUSSION 


The two main observations that can be made from the results reported in the 
previous section are the fact that A-level grades appear to be easier to predict in 
Physics, Chemistry and French than they are in Geography, History and English, and 
that teachers’ estimates appear to be better predictors of these grades than are O-level 
grades in the same subject. 

The second of these observations, along with any comparisons that might be
drawn between the correlations reported in this study and those reported elsewhere,
depends to some extent on how much the levels of correlation reported in Tables 1
and 2 have been influenced by the restricted range of grades used in each case as the
predictor variables. Guilford and Fruchter (1973) have suggested a correction that
may be applied to correlation coefficients in order to compensate for a restriction of
range. This correction is recommended for use where selection has taken place on the
basis of the predictor variable and where the correlations obtained are to be applied
to a wider population.


If the results of the present investigation are only to be interpreted in terms of 
UCCA applicants, then there would seem to be no basis for correcting the correlations; 
however, if the results are to be generalised to a wider population of GCE candidates 
then there would seem to be a strong case for correcting the O-level with A-level 
correlations, in particular, to compensate for the high degree of selection that appears 
to have taken place on the basis of O-level grades. This effect is shown in Table 5, 
which compares characteristics of the O-level grade distribution of the UCCA sample 
with the same characteristics of the grade distribution of the total 1976 O-level entry. 
It can be seen from Table 5 that the candidates in the UCCA sample have a much 
higher mean grade than the 1976 O-level entry in each subject, and the standard 
deviations of their grade distributions are considerably lower in every case. It is the 
latter of these two features that potentially deflates the levels of correlation obtained. 
Guilford and Fruchter's correction estimates the levels of correlation that might have 
been obtained had a random sample of the O-level entry been used instead of this 
sample, who were apparently selected on the basis of their O-level grades. The 
corrected correlations are shown in Table 6. The levels of correlation reported there 
illustrate the considerable effect that the restricted range of O-level grades had on the 
results reported in Table 1. In French, for example, the original correlation of 0.57
is corrected to a level of 0.90.
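
The paper does not print the correction formula it took from Guilford and Fruchter (1973). As a hedged illustration of the general technique, the sketch below uses the common textbook correction for direct selection on the predictor (often called Thorndike's Case 2), which rescales the observed correlation by the ratio of the unrestricted to the restricted standard deviation. That this is the exact variant used in the paper is an assumption, and the function name is hypothetical. With the rounded French figures quoted here (r = 0.57, standard deviations 0.70 and 1.66 from Table 5) it returns a value somewhat below the 0.90 given in the text, so the paper's own calculation presumably used unrounded inputs or a different variant.

```python
import math


def correct_for_range_restriction(r, sd_restricted, sd_unrestricted):
    """Textbook correction for direct range restriction on the predictor.

    r                observed correlation in the selected group
    sd_restricted    predictor SD in the selected group
    sd_unrestricted  predictor SD in the wider population
    """
    if sd_restricted <= 0 or sd_unrestricted <= 0:
        raise ValueError("standard deviations must be positive")
    k = sd_unrestricted / sd_restricted
    return (r * k) / math.sqrt(1.0 - r * r + r * r * k * k)


# French: observed r from the text, SDs from Table 5 (UCCA sample vs 1976 entry).
french = correct_for_range_restriction(0.57, sd_restricted=0.70, sd_unrestricted=1.66)
print(f"French, corrected for restriction of range: {french:.2f}")
```

When the two standard deviations are equal the function returns the observed correlation unchanged, which is a quick check that no correction is applied where no selection has occurred.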


It is highly unlikely that correlations as high as these could ever be obtained from 
a group of candidates who had done both O-level and A-level in the same subject, 
because it is almost inevitable that selection for A-level courses will always be based to 
a certain extent on O-level grades. These corrected correlation coefficients, however, 
do give us an insight into the levels of agreement that we might expect between O-level 
grades and A-level grades if a similar population of candidates entered for both 
examinations, or if a random sample of O-level candidates went on to do A-level. 


It is easier to estimate the effects of selection in restricting the range of O-level
grades that were reported by the UCCA applicants than it is to estimate this effect
on the teachers' estimates of A-level grades. One would certainly expect a random
sample of A-level candidates to be given a greater range of predicted A-level grades
than the UCCA sample were, but we will go no further than to say that some form of
correction should almost certainly be applied to the correlations reported in Table 2
if the results are to be generalised to all A-level candidates.


TABLE 5


MEANS AND STANDARD DEVIATIONS BY SUBJECT OF O-LEVEL GRADE DISTRIBUTIONS
FOR UCCA APPLICANTS INCLUDED IN TABLE 1 AND THE TOTAL 1976
O-LEVEL ENTRY

                         UCCA Sample          Total 1976 O-level Entry
                      Grade Distribution         Grade Distribution
                        Mean      SD               Mean      SD
Physics                 1.95     0.83              3.44     1.61
Chemistry               2.00     0.85              3.42     1.65
French                  1.65     0.70              3.45     1.66
Biology                 1.91     0.84              3.54     1.65
Mathematics             1.69     0.77              3.51     1.65
Geography               2.02     0.80              3.62     1.64
History                 1.93     0.87              3.50     1.70
English Language        1.88     0.82              3.42     1.54
English Literature      2.02     0.90              3.49     1.59


TABLE 6


CORRELATIONS BETWEEN O-LEVEL GRADES AND A-LEVEL
GRADES CORRECTED FOR THE RESTRICTED RANGE OF O-LEVEL
GRADES OBTAINED BY THE CANDIDATES IN THE UCCA
SAMPLE

                      Number of      Corrected Correlation
                      Candidates         Coefficients
Physics                  3407               0.81
Chemistry                3255               0.79
French                   1688               0.90
Biology                  2346               0.78
Mathematics              3110               0.81
Geography                1975               0.73
History                  2325               0.64
English Language         3462               0.63
English Literature       3202               0.59


It would be possible to enhance the levels of the reported correlations even further 
to account for the effect of correlating examination grades rather than raw examination 
marks, and also to account for the unreliability of the examination grades themselves. 
We will not apply either of these further corrections, however, because those who 
use these examinations for predictive purposes are most unlikely to have access to 
the raw examination marks, and the measurement unreliability of examination grades 
is one of the very real hazards that confront those who attempt to predict future 


examination performance. 
CONCLUSIONS 


The main conclusions to be drawn from this study are that there is a moderate 
level of agreement between the O-level grades and A-level grades obtained by UCCA 
applicants, and a slightly higher level of agreement between the teachers’ estimates 
and actual A-level grades. In neither case is the level of agreement high enough to 
make either O-level performance or the teachers’ estimates particularly accurate
predictors of A-level performance, and they should never be treated as more than a
rough guide. There was additional evidence in this study that the teachers’ estimates 
of A-level performance tended to be optimistic rather than pessimistic, to the order 
of one grade per applicant per subject. 

It was particularly noticeable that the UCCA applicants had obtained relatively 
high O-level grades in the subjects that they were taking at A-level, and that the 
teachers’ estimates for their A-level grades were high also. The implications of these 
characteristics of the data have been discussed in terms of the effect that they have in 
lowering the values of the correlation coefficients observed. It must be concluded 
that, by analysing similar data from a different sample, somewhat higher valued 
correlation coefficients might be obtained. This would be particularly true if the 
sample included candidates who better represented the range of ability normally 
found amongst O-level candidates. 


Perhaps the most striking finding of the whole investigation was the much higher 
level of correlation that was found both between O-level grades and A-level grades 
and between predicted and actual A-level grades in Physics, Chemistry and French, 
than was found for Geography, History and English. It is difficult to know exactly 
why results in the first group of subjects are so much easier to predict, but the most
plausible explanation seems to be that grades in these subjects may be more reliable 
and less subject to many of the reliability of measurement problems that make 
predicting the results of educational examinations an extremely hazardous occupation. 


FOOTNOTE 

All of the statistical analyses were conducted on the basis of converting the O-level grades A, B,
C, D, E, U to a numerical scale 1, 2, 3, 4, 5, 6 and the A-level grades A, B, C, D, E, O, F to a
numerical scale 1, 2, 3, 4, 5, 6, 7.
HOW VALID ARE SCHOOL EXAMINATIONS? AN
EXPLORATION INTO CONTENT VALIDITY 


By R. HOSTE
(Department of Education, University of Stirling) 


Summary. It is generally accepted that quantification of content validity is not possible.
In this paper, a proposal is made by which a content validity coefficient, Cov, can be
calculated. An example of the use of the coefficient is given, demonstrating that different
question combinations in a CSE biology examination in which a choice of questions
was given gave different levels of content validity. The coefficient was used to
demonstrate that deficiencies in validity in the examination are due more often to inadequate
sampling of the educational objectives which the examination might be expected to test
rather than to inadequate sampling of the subject matter.


INTRODUCTION 


It is generally expected that public examinations offer three guarantees of quality.
The first is that the examination is measuring what it purports to measure: that a
test of history is not a test of English language skills, but a test of history. This is
referred to as the examination validity. The validity of public examinations has been
the subject of only a few empirical investigations, e.g., Connaughton (1968), Mylrea 
(1974), Vincent (1975) and Hoste (1976, 1977). 

A second is that the same rank order of candidates will result irrespective of 
where the examination was taken, the conditions under which it was attempted, and 
of who marked the scripts. This is the examination’s reliability. Some considerable 
attention has been paid to reliability and it has been the subject of research studies 
by Connaughton (1968), Willmott (1972), Mylrea (1974), Willmott and Nuttall (1975) 
and Hoste (1977). 


Thirdly, it is expected that standards (a) in different subjects within a board are
similar, (b) in the same subject are the same whichever board sets the examination,
and (c) candidates who sat the mode 1 examination set by the board have the same
standards of attainment as those who sat a school’s mode 3 examination. Considerable
research effort has gone into ascertaining this comparability of public examinations,
especially in regard to CSE examinations. Examples can be found in the work
of Skurnik and Hall (1969), Skurnik and Connaughton (1970), Nuttall et al. (1974),
Bloomfield et al. (1977) and Willmott (1977).


The research into validity, reliability and comparability 

It can be seen that there has been considerable research effort recently in England 
and Wales to establish the degree of comparability and (but to a lesser extent) the 
reliability of examinations. It is widely accepted that validity is, in fact, very much 
more important than reliability or comparability since if, for example, a geography 
examination is not measuring what is accepted as geography, it is irrelevant that it has 
high reliability and produces a stable rank order, or that whatever masquerades under 
the name of geography is comparable in standard with the grade in so-called geography 
produced by another board. Yet despite the arguably more important guarantee of 
validity, research studies are fewer. 


The nature of validity 
Validity is a complex topic, which perhaps accounts for some of the lack of progress
in the area. A classification of the different types of validity was proposed by the 
American Psychological Association (1954): 

(1). Face validity is concerned with whether the examination looks as though it is
testing what it purports to be testing. This is probably the most common test of 
validity used by examination boards in the preparation of their papers. Panels of 
experienced teachers of the subject, acting as a committee, scrutinise the proposed 
questions and reject any that do not appear to be valid tests of the subject. 


(2). Content validity is concerned with "showing that the test items are a sample
of a universe" (Cronbach and Meehl, 1955). French and Michael (1966) describe
this "universe" in the following way: "The aptitudes, skills and knowledge required
of the student for successful test performance must be precisely the types of aptitudes,
skills and knowledges that the school wishes to develop in students, and to evaluate
in terms of test scores." Content validity, like face validity, is concerned with what is
being examined, but is more precise in that it compares this with a previously defined
specification, rather than with some vague notion in the mind of an 'expert'.


(3). Criterion related validities. These forms of validity compare scores on an
examination with scores on some external criterion. This may be a similar
examination taken at the same time (concurrent validity) or it may be a criterion measure made
some time in the future (predictive validity).


(4). Construct validity. This form of validity is concerned with the match 
between the examination and those attributes which are presumed to underlie test 
performance. 

Connaughton, writing in 1969, expressed the view that: "The content validity
aspect of examinations will probably therefore, be quite closely studied in the coming
years. There is still little sign of progress in the study of predictive and construct
validity." Although it is true that examination boards are now using specification
grids far more than they were when this was written, despite Connaughton's optimism,
little advance has been made in the study of content validity.


One of the problems which may contribute to the lack of progress in the
theoretical study of content validity is that there is no accepted way in which it can
be quantified. For example: "... quantitative evidence of content validity is not
obtainable" (Lennon, 1956); "Content validity cannot be expressed as a validity
coefficient" (Magnusson, 1967).

The lack of progress since the 1950s and 1960s is shown by a more recent
pronouncement: "The validity of an examination cannot be measured precisely in
the way that reliability can" (Dobby and Duckworth, 1979).


This latter remark is rather surprising since concurrent and predictive validity 
have been quantified by correlation coefficients almost since psychometrics began. 
Factor analytical techniques can provide one source of evidence about construct 
validity, although they have not often been applied to the study of attainment 
examinations, except by Vincent (1975) and Hoste (1976, 1977). I would agree that 
face validity defies quantification, but I devote the remainder of this paper to a means 
of quantifying content validity, and demonstrating its application in relation to a 
CSE biology examination taken in 1972. 


METHOD 


The development of a specification for a CSE biology examination 

In order to have a standard for comparison it is necessary to derive a specification 
for an examination. This is conventionally prepared by determining the weighting 
of questions to be given to each subject area, and to each of the skills and abilities 
which the curriculum aims to develop in the pupils following it. In a study of item 
banking in biology, Duckworth and Hoste (1975) asked 93 teachers of CSE and 
GCE O-level biology in England to indicate the number of questions which would
need to be devoted to each subject area, and to each academic and practical skill
(i.e., the ‘objectives’ of their teaching) in biology to reflect the emphasis given to
these areas and skills in their own teaching. 


The subject areas of school biology were identified by means of a scrutiny of 
CSE and GCE examining-board syllabuses and a list of curriculum objectives was
drawn up after a review of relevant research and discussions with teachers. These
subject areas and curriculum objectives are listed below. 


Biology subject areas 


1. Classification of animals and plants. 

2. Cell structure and function. 

3. Movement: muscular and skeletal systems (including plant skeleton). 
4. Respiration: gaseous exchange. 

5. Nutrition. 

6. Excretion. 

7. Transportation and circulation: movement of substances within an organism. 
8. Water relations of animals and plants.

9. Soil. 

10. Response to stimuli: animal behaviour. 

11. Reproduction. 

12. Evolution and genetics. 

13. Ecology: interdependence of plants and animals. 

14. Hygiene. 

15. Applied biology. 


Objectives of biology theory 


Candidates should 


1. Know particular biological terminology.
2. Know specific biological facts.
3. Identify form, structure, etc. and state their function.
4. Know classification of organisms and the criteria employed.
5. Know more important biological generalisations, principles and theories.
6. Grasp the context of a biological text and summarise or explain its contents.
7. Interpret experimental data and draw reasonable conclusions from them.
8. Apply scientific principles in new situations in biology.
9. Analyse biological problems in order to determine a method for their solution.
10. Identify cause-effect relationships and isolate relevant facts from irrelevant ones.
11. Formulate a hypothesis from available evidence.
12. Plan experiments to test a hypothesis.
13. Evaluate conclusions in the light of the procedures on which they are based.
14. Prepare an effective report of an experiment.


Objectives of practical biology 


Candidates should 


1. Carry out written instructions for experimental procedures.
2. Set up and operate laboratory apparatus.
3. Make systematic observations.
4. Make a labelled drawing of a specimen.
5. Recognise specimens.
6. Recall factual information about organisms observed or identified.
7. Record results of experiments or observations in tables.
8. Record results of experiments or observations graphically.
9. Record results of experiments or observations by annotated sketches.
10. Make calculations arising out of experiments.
11. Make deductions or hypotheses from observations or experiments.
12. Design and perform an experiment to test a hypothesis.

Using figures obtained by Duckworth and Hoste in a conventional specification
grid or ‘blueprint’ (see Wood and Skurnik, 1969) the number of items which should
be devoted to testing each skill or ability or educational objective within each subject
area can be found by cross-multiplication of the marginal totals.


But when this is done using Duckworth and Hoste's data in the margins of such a
grid, a nonsense arises in that many cells contain fewer than one question. To test
each objective in the context of each skill or ability it would be necessary to devise a
test containing several hundred items. A solution is to reduce the number of both
subject areas and objectives by reclassifying them into broader categories. This has
been done in Table 1, where a grid of five subject areas and five objectives has been
produced for a 100-item practical biology test.


Having produced a specification, the next step is to calculate the number of items
appropriate for each cell by cross-multiplying the marginal totals. It is not wise to
follow a strictly mechanical line, and accept the cross-products uncritically. It may 
be that certain educational objectives are more appropriate to certain subject areas, 
and some judgment in adjusting the cell totals may be necessary at this stage to give 
an educationally sound specification. 
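
The cross-multiplication step can be illustrated with a minimal sketch. It assumes the marginal totals printed in Table 1 and takes each cell as row total × column total ÷ grand total; the function and variable names are illustrative only, and the judgmental adjustment of cells described above is not modelled.

```python
# Marginal weightings for the 100-item practical test (taken from Table 1).
row_totals = {
    "Anatomy and physiology": 58,
    "Classification": 13,
    "Evolution and genetics": 13,
    "Ecology": 11,
    "Applied biology": 5,
}
col_totals = {
    "Manipulative": 32,
    "Visual": 26,
    "Recording": 21,
    "Calculating": 5,
    "Intellectual": 16,
}


def cross_multiply(rows, cols):
    """Calculated items per cell: row total x column total / grand total."""
    grand_total = sum(rows.values())
    return {
        (r, c): rows[r] * cols[c] / grand_total
        for r in rows
        for c in cols
    }


cells = cross_multiply(row_totals, col_totals)
for area in row_totals:
    # One line per subject area, cells in column order, to two decimal places.
    print(area, [round(cells[(area, obj)], 2) for obj in col_totals])
```

Many of the calculated cells fall below one item, which is the difficulty that motivates collapsing the categories.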


We are now in a position to compare an existing test with the specification to 
learn about the content validity of the test. The most readily available practical 
biology examination (a CSE examination taken in 1972) consisted of only 26 items. 
This may sound rather short, but in fact it was the longest available for analysis. It 
was therefore necessary to reduce the specification grid yet further to 26 items. This 
was done by adjusting the totals in the margins and hence in each cell. This has been 
done in Table 2 which shows the ‘ideal’ 26-item test, which can now be used as a 
basis for comparison with the actual examination. 


The items in the examination were classified independently by five experienced 
biology teachers, with the result shown in Table 3, which gives the question classification
matrix for this examination. There is a problem here, since I am accepting
the ‘face validity’ of each item as determined by the panel of experts. There are
pitfalls in this, since what an item looks as though it is testing may be quite different 


TABLE 1


SPECIFICATION GRID FOR 100-ITEM BIOLOGY PRACTICAL EXAMINATION BASED UPON THE TEACHERS’
OPINIONS OF THE WEIGHTINGS OF SUBJECT AREAS AND CURRICULUM OBJECTIVES


                                     Curriculum objectives
Subject area       Manipulative   Visual   Recording   Calculating   Intellectual   Total

Anatomy and           18.56       15.08      12.18        2.90           9.28         58
physiology            19          15         12           3              9

Classification         4.16        3.38       2.73        0.65           2.08         13
                      (rounded figures illegible)

Evolution and          4.16        3.38       2.73        0.65           2.08         13
genetics               4           3          3           1              1

Ecology                3.52        2.86       2.31        0.55           1.76         11
                       3           3          2           1              2

Applied biology        1.60        1.30       1.05        0.25           0.80          5
                       2           1          1           0              1

Total                 32          26*        21           5*            16           100


The upper figure in each cell gives the calculated number of questions and the lower figure gives the
rounded-off total of questions in each cell.


* Because of rounding off, the numbers in these columns do not tally exactly with the ideal
no. of questions derived from teachers' opinions.
TABLE 2
THE SPECIFICATION GRID FOR A PRACTICAL EXAMINATION OF TWENTY-SIX ITEMS


                                     Curriculum objectives                                 Total
Subject area       Manipulative   Visual   Recording   Calculating   Intellectual   Rounded off   Ideal

Anatomy and            4.85        3.95       3.19        0.75           2.44           15         15.1
physiology             5           4          3           1              2

Classification         1.08        0.89       0.72        0.17           0.55            4          3.4
                       1           1          1           (remaining rounded figures illegible)

Evolution and          1.08        0.89       0.72        0.17           0.55            3          3.4
genetics               1           1          1           0              1

Ecology               (calculated figures illegible)                     0.47            3          2.9
                      (rounded figures illegible)

Applied biology        0.41        0.34       0.27        0.06           0.21            0          1.3
                       0           0          0           0              0

Totals:
  rounded off          9           7          6           1              4              26
  ideal                8.3         6.8        5.5         1.3            4.2

TABLE 3
QUESTION CLASSIFICATION GRID SHOWING DISTRIBUTION OF ITEMS IN THE
PRACTICAL EXAMINATION


Subject area              Manipulative   Visual   Recording   Calculating   Intellectual

Anatomy and physiology         1           4          2            0             12
Classification                 0           4          0            0              3
Evolution and genetics         0           0          0            0              0
Ecology                        0           0          0            0              0
Applied biology                0           0          0            0              0


from what it is in fact testing. But until the state of our knowledge of construct
validity, i.e., the construct underlying performance in an examination which each
item is testing, is improved, we have nothing to go on except the face validity of the
items. The ‘experts’ were not always unanimous. Where there were differences of
opinion, the majority view was taken; where there was no clear majority I contributed
a casting vote. There was more unanimity about the subject areas being tested than
about the curriculum objectives dealt with.


It can be seen that there are differences between the teachers' opinion (shown by 
the specification matrix in Table 2) and the examination (shown in the question 
classification matrix in Table 3). To be fair, the examination was not prepared to
this specification and there is no reason why it should be expected to conform to it, 
except that the specification reflects the opinions of a large number of biology 
teachers and a regional examination, such as the one studied, might be expected to 
show considerable agreement with teachers' opinions. 


Towards a content validity coefficient 

Validity coefficients are basically correlation coefficients. A predictive validity
coefficient, for example, is a correlation coefficient showing the strength of the
relationship between candidates' scores on one test and their scores on another. A
Pearson correlation coefficient could be calculated to show the strength of the
relationship between the specified number of items in each cell of the specification 
matrix, and the number of items which actually occur in the cell. But, unfortunately, 
product moment correlation coefficients make a number of assumptions about the 
data which are not likely to be met. They assume that the measurements are at least 
made on an interval scale, that the data are normally distributed, and that the variances 
in the two sets of data being compared are roughly equal. These assumptions may 
not be acceptable, therefore we must look for a form of correlation coefficient which 
does not rely on these assumptions. 


The quantification of content validity 

In content validity we are looking for an association between the number of 
items specified for the cells in a test construction matrix, and the number of items 
which actually occur in the cells. For this, rank order coefficients, such as Spearman's
rho and Kendall’s tau, are ruled out because of the complications resulting from tied
ranks. An obvious contender is the contingency coefficient, C. Siegel (1956)
comments: “The contingency coefficient, C, is a measure of the extent of association
or relation between two sets of attributes. It is uniquely useful when we have only
categorical (nominal scale) information about one or both sets of these attributes.”
C is derived by calculating χ² in the normal way: by taking the ‘expected’ values
(i.e., those specified for each cell), E, and the ‘observed’ values (i.e., those actually
occurring in each cell), O, and applying the formula:








$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

The value of χ² so obtained is then substituted in the equation:

$$C = \sqrt{\frac{\chi^2}{N + \chi^2}}$$


(where N = no. of cases, i.e., the no. of items in the test).


One disadvantage of the use of C as a coefficient of content validity is that it
‘works the other way round’ from other validity coefficients, i.e., when there is
complete agreement of the ‘observed’ (question classification) matrix with the
‘expected’ (specification) matrix, C is equal to 0, and increases as agreement
between the two matrices becomes less. This may be easily overcome by proposing
a content validity coefficient which is 1 − C, or


$$C_{cv} = 1 - \sqrt{\frac{\chi^2}{N + \chi^2}}$$


This coefficient is referred to hereafter as C_cv. With C_cv, perfect agreement between
the two matrices results in a value of 1.00, and the value becomes smaller as the content
validity of the test becomes lower.
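
As a minimal sketch of the calculation just described, the following assumes that the specification and classification matrices are supplied as flat lists of cell counts in the same order, and that no specified cell is zero. The function names and the 2 x 2 figures are hypothetical and are not taken from the examination data.

```python
import math


def chi_square(expected, observed):
    """Sum of (O - E)^2 / E over all cells; expected cells must be non-zero."""
    return sum((o - e) ** 2 / e for e, o in zip(expected, observed))


def content_validity(expected, observed, n_items):
    """Return (chi2, C, C_cv), with C = sqrt(chi2 / (N + chi2)) and C_cv = 1 - C."""
    chi2 = chi_square(expected, observed)
    c = math.sqrt(chi2 / (n_items + chi2))
    return chi2, c, 1.0 - c


# Hypothetical 40-item test on a 2 x 2 grid, cells listed row by row.
spec = [10, 10, 10, 10]
perfect = [10, 10, 10, 10]   # matches the specification exactly
skewed = [16, 4, 14, 6]      # over-weights the first column

print(content_validity(spec, perfect, 40))
print(content_validity(spec, skewed, 40))
```

The first call shows the coefficient at its upper limit; the second shows it falling as the observed distribution departs from the specification.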


There are some limitations to the use of C_cv as a content validity coefficient.
One is that, whilst there is an upper limit of 1.00 when the two matrices coincide
exactly (equivalent to a value of C = 0.00, which is perfectly satisfactory), the coefficient
cannot reduce to zero. This is because the lower limit to C_cv is a function of the
number of cells in the matrix. Siegel (1956) quotes maximum values of C of 0.707
for a 2x2 matrix and 0.816 for a 3x3 matrix; this means that a 2x2 test matrix
will give a minimum value for C_cv of 0.293 (i.e., 1 − 0.707) and a 3x3 matrix a
minimum value of 0.184. This is not seen, however, as a major difficulty, since tests
with C_cv approaching these values show very little content validity, and may be
classed together as unsatisfactory.
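
The two limits quoted from Siegel are consistent with the standard upper bound for C in a square k x k table, sqrt((k − 1)/k). The sketch below assumes that bound; the figures it gives for 4x4 and 5x5 matrices are an extrapolation and do not appear in the text.

```python
import math


def max_contingency(k):
    """Upper limit of C for a k x k table, assuming the bound sqrt((k - 1) / k)."""
    return math.sqrt((k - 1) / k)


def min_content_validity(k):
    """Lowest attainable C_cv (= 1 - C) for a k x k matrix under that bound."""
    return 1.0 - max_contingency(k)


for k in (2, 3, 4, 5):
    print(k, round(max_contingency(k), 3), round(min_content_validity(k), 3))
```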
Like other validity coefficients, C_cv is not linear, i.e., a value of C_cv of 0.8 is not
‘twice as good’ as one of 0.4.


C_cv cannot be directly compared with other validity coefficients, nor are two
C_cv coefficients directly comparable unless they are derived from matrices of the same
size. Thus, if one test has been assessed for content validity against a 2x2 specification
matrix, the resultant value of C_cv cannot be compared with one for a test with
a 3x3 matrix. However, the significance of the two coefficients can be calculated,
and compared. As Siegel puts it: “... in the course of computing the value of C we
compute a statistic which itself provides a simple and adequate indication of the
significance of C. This statistic is of course χ². We may test whether an observed C
differs significantly from chance by determining whether the χ² for the data is
significant” (Siegel, 1956). But in the case of a test specification matrix, we are not dealing
with a random matrix, nor are we looking for a deviation of the observed matrix
from chance. Therefore we must calculate the number of degrees of freedom in a
way other than that which would be appropriate for a random matrix. In the present
case, because the number of items in the final cell of the matrix is not determined until
all but one of the cells are filled, the formula df = (kr − 1) is used (where k = no. of
columns in the matrix, and r = no. of rows).
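
A small sketch of the significance check follows. It assumes the usual published upper-tail critical values of χ² for 1 and 3 degrees of freedom, hard-coded here rather than computed; the helper names are illustrative, and the two χ² values in the example are those printed in Table 6.

```python
# Standard upper-tail critical values of chi-square at the 0.05, 0.01 and 0.001 levels.
CRITICAL = {
    1: (3.841, 6.635, 10.828),
    3: (7.815, 11.345, 16.266),
}


def degrees_of_freedom(k, r):
    """df = kr - 1 for a specification matrix with k columns and r rows."""
    return k * r - 1


def significance(chi2, df):
    """Most stringent level at which chi2 is significant, or 'NS' if none."""
    level = "NS"
    for label, critical in zip(("0.05", "0.01", "0.001"), CRITICAL[df]):
        if chi2 >= critical:
            level = label
    return level


df = degrees_of_freedom(2, 2)
print(df, significance(27.74, df), significance(7.29, df))
```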


The content validity coefficient, C_cv, in use

The theory paper associated with the practical test referred to earlier consisted of
two parts. Part 1, containing 25 multiple-choice items, was compulsory; Part 2
allowed a choice of two out of five structured questions. Candidates could therefore
choose one of ten different combinations of questions in Part 2 (1 and 2,
1 and 3, etc., through to 4 and 5). This question choice produced ten groups of
candidates each attempting what was in effect a different examination. Comparisons
have been made between the ‘expected’ distribution of items in the specification
matrix, and each of the ten ‘observed’ distributions arising from the ten different
combinations of questions. The ‘expected’ specification grid, based on the consensus
opinion of the teachers consulted, is shown in Table 4. As in the grid for the practical
examination (Table 2), the ‘subject areas’ categories and the ‘educational objectives’
categories have been reduced in number to produce a matrix capable of meaningful
interpretation.
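
The count of ten question combinations can be checked with a short sketch; the variable names are illustrative only.

```python
from itertools import combinations

# The five structured questions in Part 2, of which candidates answer two.
part2_questions = [1, 2, 3, 4, 5]
pairs = list(combinations(part2_questions, 2))

print(len(pairs))
print(pairs)
```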


TABLE 4


SPECIFICATION GRID FOR 100-ITEM BIOLOGY THEORY EXAMINATION DERIVED FROM THE TEACHERS’
OPINIONS OF SUBJECT AREAS AND CURRICULUM OBJECTIVES


                                        Curriculum objectives
Subject area       Knowledge   Comprehension   Application   Analysis   Synthesis   Evaluation   Total

Anatomy and          28.42         8.70           4.06         5.22       9.28        2.32         58
physiology           29            9              4            5          9           2

Classification        6.37         1.95           0.91         1.17       2.08        0.52         13
                     (rounded figures illegible)

Evolution             6.37         1.95           0.91         1.17       2.08        0.52         13
                      6            2              1            1          2           1

Ecology               5.39         1.65           0.77         0.99       1.76        0.44         11
                      5            2              1            1          2           0

Applied               2.45         0.75           0.35         0.45       0.80        0.20          5
biology               3            0              0            1          1           0

Total                49           15              7            9         16           4           100


The upper figure in each cell gives the calculated number of questions and the lower figure gives the
rounded-off totals of questions in each cell.
When the items which each population of candidates (arising from the different
question combinations selected) attempted are classified, the resultant matrices, which
include the items in Part 1 and Part 2 of the theory paper, are seen to be rather biased.
Many blank cells appear in the question classification matrices as, for example, in
that for question combination 2 and 3 shown in Table 5. There is an undue concentration
of questions testing knowledge of anatomy and physiology. In order to
calculate χ² the marginal categories must be aggregated if more than 20 per cent of the
cells have an expected frequency of less than 5, or if any cell has an expected
frequency of less than 1 (Siegel, 1956). In Table 6 the marginal categories have been
collapsed to form 2x2 matrices in which one subject area category consists of
‘anatomy and physiology’, the other contains all other subject areas (‘OSA’).
The educational objectives margin has been collapsed to ‘knowledge’ (in the sense
used by Bloom et al., 1956), and ‘other curriculum objectives’ (OCO).
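
The collapsing step can be sketched as follows for question combination 2 and 3. The counts are those of Table 5, with one assumption: the anatomy and physiology comprehension cell is taken as 10, the value implied by that row's total of 32 and the comprehension column total of 19. The function name is illustrative.

```python
# Rows: subject areas; columns: knowledge, comprehension, application,
# analysis, synthesis, evaluation (Table 5, questions 2 and 3).
table5 = {
    "Anatomy and physiology": [21, 10, 1, 0, 0, 0],  # 10 is inferred from the totals
    "Classification": [5, 2, 0, 0, 0, 0],
    "Evolution and genetics": [10, 5, 0, 0, 0, 0],
    "Ecology": [3, 2, 0, 0, 0, 0],
    "Applied biology": [0, 0, 0, 0, 0, 0],
}


def collapse(grid, kept_row="Anatomy and physiology"):
    """Collapse a grid to [[AP-K, AP-OCO], [OSA-K, OSA-OCO]]."""
    collapsed = [[0, 0], [0, 0]]
    for area, counts in grid.items():
        i = 0 if area == kept_row else 1
        collapsed[i][0] += counts[0]        # knowledge
        collapsed[i][1] += sum(counts[1:])  # all other curriculum objectives
    return collapsed


print(collapse(table5))
```

The result agrees with the question classification grid given for combination 2 and 3 in Table 6 (AP: 21 and 11; OSA: 18 and 9).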


When this is done the specification grids (the number of items in each question
combination varies because of the different number of items in each question in
Part 2) can be accurately compared with the question classification grids using χ²
and the values of C_cv derived from χ².
This has been done in Table 6, where it can be seen that the values of χ² range from
27.74 (indicating a high degree of deviation of the observed values from the expected
values) to 6.54 (indicating a much lower disparity). Values of C_cv corresponding to
these χ² values range from 0.41 to 0.69 according to this same degree of disparity
between the test specification matrix and the question classification matrix. Using
the levels of significance derived from the χ² tables (df = 3) it can be seen that six
question combinations give a highly significant deviation from the specification
matrix (P<0.001); another two question combinations show a significant
departure from the expected (P<0.05). The remaining two combinations do not
differ significantly from their respective test specification matrices at the 0.05 level.
In other words, eight out of ten question combinations in this examination produce
a distribution of items which differs significantly from the specification which a
consensus of biology teachers consider would reflect the emphasis in their teaching of
the subject.


In this case we cannot discuss the validity of a whole examination, because the 
practice of allowing a choice of questions permits candidates to select certain 
combinations which have different content validity to other combinations. It might 
be, however, that individual pupils are selecting a combination which is most valid 
for them, or for the course of instruction which they have followed. What is valid for 
the teaching emphases of one school in particular may not be represented in a
consensus such as the one which has been adopted for the derivation of the item


TABLE 5


QUESTION CLASSIFICATION GRID SHOWING THE
DISTRIBUTION OF ITEMS IN ONE OF THE TEN
QUESTION COMBINATIONS


                              Questions 2 and 3
Subject area               K     C    Ap    An     S     E   Total

Anatomy and physiology    21    10     1     0     0     0     32
Classification             5     2     0     0     0     0      7
Evolution and genetics    10     5     0     0     0     0     15
Ecology                    3     2     0     0     0     0      5
Applied biology            0     0     0     0     0     0      0
Total                     39    19     1     0     0     0     59

K = Knowledge; C = Comprehension; Ap = Application; An = Analysis; S = Synthesis; E = Evaluation.
specification grid. If C_cv were to be calculated against a specification matrix which
was based upon the classroom emphases of the candidates' own teacher, its value might well rise.


The above point does not, however, detract from the value of a coefficient such
as C_cv as a means of quantifying content validity. Indeed, the use of C_cv can be
taken further to investigate the source of any invalidity we have detected in the item
classification matrices.


The source of content invalidity 

By condensing the matrices in Table 6 still further, as in Tables 7 and 8, it is
possible to determine the source of the disparities between the specification and the
question classification matrices.


In question combination 3 and 5 (Table 7) 32 items should have tested anatomy
and physiology, whilst 23 should have tested other subject areas. The question
classification matrix shows that in the observed distribution of questions, only 18
test anatomy and physiology, whilst 37 test other subject areas. The calculated value
of C_cv is 0.54, based on a value of 14.64 for χ², and the difference is significant at the
0.001 level (df = 1). Thus this combination of questions differs significantly from the
specification in terms of the subject areas it tests. In contrast, the question classification
matrix for questions 1 and 5 shows complete agreement with the specification, resulting
in a value of 1.00 for C_cv and 0.00 for χ², and obviously does not show any significant
difference.


Taking Table 7 overall, the question classification matrices for six question
combinations do not show significant differences, the values of C_cv ranging from 0.81
to 1.00. Three show differences which are significant at the 0.01 level, with values
of C_cv ranging from 0.61 to 0.64. The tenth combination, showing a highly significant
difference between the specification and the observed question distribution, has a
C_cv value of 0.54.
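
The pattern of significance in Table 7 can be re-derived with a short sketch. It assumes the specification and observed counts as printed in Table 7 and the standard χ² critical values for one degree of freedom; the names are illustrative. Recomputed χ² values may differ from the printed ones in the second decimal place.

```python
# ((specified AP, specified OSA), (observed AP, observed OSA)) per combination, from Table 7.
TABLE7 = {
    "1&2": ((33, 24), (45, 12)),
    "1&3": ((31, 22), (26, 27)),
    "1&4": ((30, 22), (40, 12)),
    "1&5": ((30, 23), (30, 22)),
    "2&3": ((34, 25), (32, 27)),
    "2&4": ((34, 24), (46, 12)),
    "2&5": ((34, 25), (37, 22)),
    "3&4": ((31, 23), (27, 27)),
    "3&5": ((32, 23), (18, 37)),
    "4&5": ((31, 23), (32, 22)),
}
# Standard critical values of chi-square for df = 1.
CRITICAL_DF1 = (("0.05", 3.841), ("0.01", 6.635), ("0.001", 10.828))


def chi_square(expected, observed):
    """Sum of (O - E)^2 / E over the cells."""
    return sum((o - e) ** 2 / e for e, o in zip(expected, observed))


def level(chi2):
    """Most stringent level reached, or 'NS'."""
    reached = "NS"
    for label, critical in CRITICAL_DF1:
        if chi2 >= critical:
            reached = label
    return reached


counts = {}
for combo, (spec, obs) in TABLE7.items():
    chi2 = chi_square(spec, obs)
    counts[level(chi2)] = counts.get(level(chi2), 0) + 1
    print(combo, round(chi2, 2), level(chi2))
print(counts)
```

The tally reproduces the split described above: six combinations not significant, three significant at the 0.01 level and one at the 0.001 level.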


TABLE 6


SPECIFICATION GRIDS, QUESTION CLASSIFICATION GRIDS, VALUES OF C_cv AND
χ² FOR EACH COMBINATION OF QUESTIONS IN THE WRITTEN PAPER


Question               Specification     Question classification
combination            grid              grid                        C_cv     χ² (df = 3)   Sig.
                       K      OCO        K      OCO

1&2          AP        16     17         33     12                   0.44     26.19         ***
             OSA       12     12          8      4
1&3          AP        15     16         22      4                   0.51     17.08         ***
             OSA       11     11         18      9
1&4          AP        15     15         31      9                   0.41     27.74         ***
             OSA       11     11          8      4
1&5          AP        15     15         26      4                   0.51     16.86         ***
             OSA       11     11         13      9
2&3          AP        17     17         21     11                   0.67      7.29         NS
             OSA       12     13         18      9
2&4          AP        17     17         30     16                   0.49     19.95         ***
             OSA       12     12          8      4
2&5          AP        17     17         25     12                   0.69      6.54         NS
             OSA       12     13         13      9
3&4          AP        15     16         19      8                   0.60     10.27         *
             OSA       11     12         18      9
3&5          AP        16     16         14      4                   0.46     22.61         ***
             OSA       11     12         23     14
4&5          AP        15     16         23      9                   0.64      8.10         *
             OSA       11     12         13      9


* P<0.05; ** P<0.01; *** P<0.001; NS = Not Significant; AP = Anatomy &
Physiology; OSA = Other Subject Areas.
TABLE 7


SPECIFICATION GRIDS, QUESTION CLASSIFICATION MATRICES, VALUES OF C_cv
AND χ² FOR EACH COMBINATION OF QUESTIONS IN THE WRITTEN PAPER,
SHOWING DIFFERENCES FOR SUBJECT AREAS


Question               Specification     Question classification
combination            grid              grid                        C_cv     χ² (df = 1)   Sig.

1&2          AP        33                45                          0.61     10.36         **
             OSA       24                12

1&3          AP        31                26                          0.81      1.94         NS
             OSA       22                27

1&4          AP        30                40                          0.64      7.88         **
             OSA       22                12

1&5          AP        30                30                          1.00      0.00         NS
             OSA       23                22

2&3          AP        34                32                          0.93      0.27         NS
             OSA       25                27

2&4          AP        34                46                          0.61     10.23         **
             OSA       24                12

2&5          AP        34                37                          0.90      0.62         NS
             OSA       25                22

3&4          AP        31                27                          0.85      1.21         NS
             OSA       23                27

3&5          AP        32                18                          0.54     14.64         ***
             OSA       23                37

4&5          AP        31                32                          0.97      0.07         NS
             OSA       23                22


* P<0.05; ** P<0.01; *** P<0.001; NS = Not Significant; AP = Anatomy &
Physiology; OSA = Other Subject Areas.


TABLE 8


SPECIFICATION GRIDS, QUESTION CLASSIFICATION MATRICES, VALUES OF
C_cv AND χ² FOR EACH COMBINATION OF QUESTIONS IN THE WRITTEN PAPER,
SHOWING DIFFERENCES FOR EDUCATIONAL OBJECTIVES


Question               Specification     Question classification
combination            grid              grid                        C_cv     χ² (df = 1)   Sig.

1&2          K         28                41                          0.59     11.86         ***
             OCO       29                16

1&3          K         26                40                          0.53     14.79         ***
             OCO       27                13

1&4          K         26                39                          0.55     13.00         ***
             OCO       26                13

1&5          K         26                39                          0.55     13.00         ***
             OCO       26                13

2&3          K         29                39                          0.68      6.78         **
             OCO       30                20

2&4          K         29                38                          0.70      5.58         *
             OCO       29                20

2&5          K         27                37                          0.67      7.03         **
             OCO       30                21

3&4          K         26                37                          0.62      8.97         **
             OCO       28                17

3&5          K         27                37                          0.66      7.27         **
             OCO       28                18

4&5          K         26                36                          0.65      7.42         **
             OCO       28                18


* P<0.05; ** P<0.01; *** P<0.001; K = Knowledge; OCO = Other
Curriculum Objectives.
Table 8 is concerned with the data showing the specification for the distribution
of questions according to the educational objectives tested. In this case all combinations
of questions differ significantly from the specification derived from the consensus
opinion of teachers. Combination 2 and 4 is the least divergent, with C_cv = 0.70, but
even this difference is significant at the 0.05 level. Four combinations (all involving question 1)
differ at the 0.001 level, with values of C_cv ranging from 0.53 to 0.59. It is clear that
the low values of C_cv are due to the examination questions testing the knowledge
category rather than other educational objectives.


CONCLUSION 


We can conclude, therefore, that the differences between the specification matrices 
and the question classification matrices are due more to an inadequate testing of the 
educational objectives than of the subject areas. Even so, there are still some 
distortions which are due to inadequacies in testing subject areas in some combinations 
of questions. 


The value of the content validity coefficient C_cv
Contrary to earlier opinion it has been shown to be possible to quantify content
validity in a meaningful manner. The coefficient C_cv permits:


(i) The comparison of two or more examinations with each other. When the
number of cells in the specification matrices is the same, direct comparison
is possible with C_cv; when the number of cells in each matrix differs,
comparison is possible through the confidence levels associated with the
value of χ² used in the computation of C_cv.

(ii) The comparison of alternative combinations of questions in an examination 
where a choice is given. 

(iii) Examination boards to set limits of tolerance within which an examination 
should conform to the specification matrix, and then to ascertain whether 
it is within the set limits. 

(iv) The search for sources of content invalidity, either in terms of broad 
classifications such as the subject areas or educational objectives (as shown 
in the example discussed earlier in this paper), or to narrow down the search 
to particular areas or objectives within certain questions. 


The Appendix to this paper lists critical values of C_cv for tests of 25, 50, 75 and 100 items
in length, at three different levels of significance, for matrices up to 5x5.
APPENDIX
CRITICAL VALUES OF C_cv FOR THREE LEVELS OF SIGNIFICANCE, FOR TESTS COMPRISING 25, 50, 75 AND 100 ITEMS


                                            No. of items in test
                      25                    50                    75                    100
Size of         level of significance  level of significance  level of significance  level of significance
matrix    df    0.05   0.01   0.001    0.05   0.01   0.001    0.05   0.01   0.001    0.05   0.01   0.001

1x2        1    0.63   0.54   0.45     0.73   0.66   0.58     0.78   0.71   0.64     0.81   0.75   0.69
2x2        3    0.51   0.44   0.37     0.63   0.57   0.50     0.69   0.64   0.58     0.73   0.68   0.62
2x3        5    0.45   0.39   0.33     0.57   0.52   0.46     0.64   0.59   0.54     0.68   0.64   0.59
2x4        7    0.40   0.35   0.30     0.53   0.48   0.43     0.60   0.55   0.51     0.65   0.60   0.56
3x3        8    0.38   0.33   0.28     0.51   0.46   0.41     0.58   0.54   0.49     0.63   0.59   0.54
2x5        9    0.36   0.32   0.27     0.50   0.45   0.40     0.57   0.53   0.48     0.62   0.58   0.53
3x4       11    0.34   0.29   0.25     0.47   0.42   0.38     0.54   0.50   0.46     0.59   0.55   0.51
3x5       14    0.30   0.27   0.23     0.43   0.39   0.35     0.51   0.47   0.43     0.56   0.52   0.48
4x4       15    0.29   0.26   0.22     0.42   0.38   0.34     0.50   0.46   0.42     0.55   0.52   0.48
4x5       19    0.26   0.23   0.20     0.39   0.35   0.32     0.46   0.43   0.39     0.52   0.48   0.45
5x5       24    0.23   0.20   0.18     0.35   0.32   0.29     0.43   0.40   0.36     0.48   0.45   0.42


The values of C_cv are the lowest which the coefficient can have if the test is to be considered not to depart
significantly from its specification. For example, for a 75-item test based on a 3x3 specification matrix, a value
of C_cv lower than 0.58 indicates that the classification matrix shows a difference from the specification matrix at the
0.05 level. If C_cv is lower than 0.54, it shows a difference at the 0.01 level; and if lower than 0.49, it is different
at the 0.001 level. Values higher than 0.58 but below 1.00 indicate that the test differs from its specification
matrix, but with increasing coincidence as C_cv approaches 1.00. Test compilers may wish to set higher limits to
values of C_cv to ensure greater resemblance between their product and its specification.
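
The critical values in the Appendix appear to follow from substituting the critical value of χ² for the relevant degrees of freedom into the formula for C_cv. The sketch below assumes this, using standard χ² critical values for df = 1 and df = 8 only; the function name is illustrative, and the results agree with the tabulated entries to within rounding in the second decimal place.

```python
import math

# Standard critical values of chi-square at the 0.05, 0.01 and 0.001 levels.
CHI2_CRITICAL = {
    1: (3.841, 6.635, 10.828),
    8: (15.507, 20.090, 26.124),
}


def critical_ccv(n_items, df):
    """Critical C_cv = 1 - sqrt(chi2_crit / (N + chi2_crit)) at each level."""
    return tuple(1.0 - math.sqrt(x / (n_items + x)) for x in CHI2_CRITICAL[df])


# 75-item test on a 3x3 matrix (df = 8), then a 25-item test with df = 1.
print([round(v, 3) for v in critical_ccv(75, 8)])
print([round(v, 3) for v in critical_ccv(25, 1)])
```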
EXPECTED TIME OF TEST AND THE ACQUISITION OF
KNOWLEDGE


By G. D'YDEWALLE, MARLEEN DEGRYSE AND E. DE CORTE
(University of Leuven, Belgium)


Summary. Subjects from four classes (two classes in physics and two classes in Dutch 
literature) expected to receive either a delayed test (one or two weeks later) or a test 
immediately after their class. For all subjects, the test was given at the same time 
(immediate test). Male subjects expecting an immediate test performed better than male
subjects expecting a delayed test. No clear findings were obtained with female subjects.
While the data were inconclusive about a hypothesis derived from information processing 
studies, the absence of interaction effects with various personality measurements and the 
data from the post-experimental questionnaire strongly suggested the inadequacy of the 
goal gradient hypothesis. 


INTRODUCTION 


THE present experiment investigates the effect of the expected time of testing upon 
the acquisition of knowledge. Half of the subjects anticipated a test immediately 
after their class hour, while the other half expected to receive the test one or two 
weeks later. As the emphasis of the investigation is on the acquisition, all subjects
received the test immediately at the end of the class hour. 


Effects of the expected time of testing on learning have not been a favourite 
research topic. A few studies were carried out in the early years of experimental
psychology. The experiments by Aall (1912, 1913), using various learning materials, 
are well known, but his data and conclusions were not clear at all. The same is true 
for Boswell and Foster (1916) whose subjects learned series of Chinese-English pairs 
of words either for permanent retention or for merely temporary recall. Geyer (1930) 
gave instructions to recall lists of paired-associates at a given time while Thisted 
and Remmers (1932) provided time-sets under ordinary classroom conditions. In 
Zoltobrocki (1961), students learned two rows of nonsense syllables and were told 
that one of the two rows was to be recalled immediately following the learning 
situation and the other one week later. In fact, both rows were to be reproduced at 
the same time. All the preceding experiments included an immediate and a delayed
test. The data tend to establish a better performance on the immediate test when it
was anticipated, but there was less forgetting (as measured in the delayed test) when
the temporal set with a delayed recall was introduced. However, the differences
obtained are generally not particularly impressive. 


Recently there has been an upsurge of interest within information processing 
approaches. In the two-stage model of Atkinson and Shiffrin (1968), cognitive pro- 
cesses (control processes) serve to maintain new information in the short-term store 
(STS) and, at the same time, to transfer some of this information into the long-term 
store (LTS). The two functions have a trade-off relationship because both use the 
limited capacity of STS. Bellezza and Walker's (1974) study illustrates nicely this 
trade-off. One group of subjects learned different short lists of words and were 
asked to recall immediately following the presentation of each list. Accordingly, the 
task did not require transfer of information into LTS, although a final recall test 
was unexpectedly given. Another group was told, before being exposed to the lists, 
that they would be given a final recall test. While the first group out-performed 
the second group on the immediate recall, the opposite was found in the final recall, 
confirming the trade-off hypothesis between the information maintained in STS and
transferred into LTS.


Götz and Jacoby (1974) reformulated the above trade-off hypothesis of STS
processes within the framework of Craik and Lockhart (1972): subjects
expecting a delayed recall test are likely to organise items or process them to a
‘deeper’ level, to ensure that they will still be accessible at the time of the delayed
recall. Short lists were presented and free recalled either immediately or after a 
period of number subtractions (delayed recall). Subjects were informed either prior 
to the list presentation (pre-cue condition) or after the list presentation (post-cue 
condition) which of the two delay conditions would follow. In the final free recall 
of the pre-cue condition, subjects with number subtraction performed better than 
subjects with no delayed recall. Since delay was not predictable during study in the
post-cue condition, the final recall advantage of the delayed items disappeared.
Dark and Loftus (1976) were unable to replicate Götz and Jacoby (1974). However,
they presented items for study at a substantially faster rate than did Götz and Jacoby.
Jacoby et al. (1978) showed that subjects' better preparation for a delayed test (by
processing the items in a more meaningful, ‘deeper’ fashion) only occurs when they
are given sufficient time for processing which is appropriate to the task.


The time set may be viewed as a variable related to the goal gradient hypothesis, 
stating that the motivation is increased when the goal becomes closer. There are, 
however, problems when applying the goal gradient hypothesis to the present study. 
First, the goal gradient hypothesis has generally not been framed to investigate its 
motivational implications on information acquisition (learning). While an overview 
of studies investigating motivational variables on learning produces a rather confusing 
picture of results (McLaughlin, 1965; Weiner, 1966), it is perhaps clarifying to dis- 
tinguish two possible kinds of effects. One is related to the incentive value of enhancing 
subject's learning activities (incentive function). The other effect is a function of the 
subject's processing of the learning material appropriate for the task (directive 
function). Another problem with the goal gradient hypothesis is related to the
Yerkes-Dodson Law (Broadhurst, 1959)—higher motivation is not necessarily pro- 
ducing optimal behaviour. Higher motivation can produce both task-relevant and 
task-irrelevant activities interfering with efficient learning. Moreover, the goal 
distance in time influences the arousal of anxieties which may reverse the predic- 
tions from the goal gradient hypothesis. Alpert and Haber (1960) constructed an 
achievement-anxiety scale which took two forms of anxiety into account. While 
facilitating anxiety may raise the general drive level and is assumed to improve per- 
formance, debilitating anxiety is assumed to lead to poor performance. Subjects 
high on a debilitating-anxiety scale may feel the immediate test as more threatening 
than a delayed test and show some learning inhibition. 


METHOD 


The present investigation attempted to collect data under a natural setting. 
Considerable effort was made to keep the subjects unaware of being involved in an 
experiment. The critical test involved two kinds of questions requiring either a 
‘superficial’ knowledge (i.e., a literal reproduction of what has been said during the
lesson) or a ‘deeper’ knowledge from information given in the class (requiring the
application of acquired insights into new problems). A few covariables were used to 
look at interactions between the experimental manipulation (the expected time of 
test) and a few personality variables. School is an important area for display of 
achievement-related activities. Any situation which presents a challenge to achieve 
success also poses the threat of failure. Accordingly, an achievement measurement 
was made involving two anxiety scales, facilitating and debilitating anxiety. It was 
hypothesised that facilitating anxiety would enhance a subject’s performance when the 
goal is nearby (i.e., an immediate test is expected) while subjects with a high debilitating
anxiety would show better performance with a more remote goal. Achievement
motivation and the anxiety components are assumed to have mainly an incentive 
function on the acquisition performance. The incentive function is likely to influence 
the performance on the two test parts, the reproduction test (part 1) and the transfer 
of acquired knowledge (part 2). From recent studies on information processing 
(see above), we hypothesise that the expectation of an immediate test enhances 
superficial knowledge while a more remote test goal stimulates a more in-depth 
comprehension and learning (the directive function of motivation on learning). 
According to this directive function, an interaction between the two test parts and 
the experimental manipulation is expected. To investigate structural learning 
activities in more detail, a cognitive style measurement (field-dependency) was included 
in the present experiment. 


Subjects and procedure 

The experiment was carried out in a high school near Brussels (Belgium). A
few preliminary meetings with the two teachers (one of Dutch and the other of
physics) were held to prepare the whole experiment. At the beginning of the school
year, the teachers informed the pupils how they would be examined during the 
school year. They said that about every two weeks half of the pupils would receive 
an examination immediately after the lessons and the other half on a later day. In 
the first case, the teacher would warn the pupils just before the lesson. Both teachers 
chose a lesson topic which could be given as a unit during a one-hour class. As to 
the formulation of open-ended questions we suggested that the answers on one series 
of questions (part 1) had to involve some reproduction of information given during 
the class while another series of questions (part 2) had to be directed to transfer of 
given information either to solve new problems or to apply to new issues. Secondly, 
we suggested that the questions should be phrased as they generally were in other 
examinations. The two suggestions were apparently not conflicting as the teachers 
did not experience any difficulties in finding such questions. Four classes were 
included in this experiment: two upper classes (one in physics and one in Dutch) 
and two next-to-upper classes (again in physics and in Dutch). In the following 
discussion, we call the latter ‘lower’ classes. To avoid any communication about
the experiment between the classes, the two teachers gave the two lessons on the same
day successively. All verbal interactions in the classroom (between the teacher and
the pupils) were recorded on a hidden tape-recorder operated behind the teacher's
desk (a typed version of these recordings is available on request). At the beginning
of the class, the teacher divided the class in two halves according to the alphabetical 
order of the pupils’ names. One half was told that they would be tested immediately 
after the lesson, while the other half would be tested two weeks later. This procedure 
was quite natural as the subjects were already accustomed to it. Table 1 gives the 
number of subjects in each class and experimental condition. 


At the end of the class, all subjects were given the two test parts and had to 
respond to a post-experimental questionnaire. The post-experimental questions were 
constructed to assess how the subjects experienced the whole experiment. Each 
question was followed by a seven-point scale, with a verbal description of the two 
extremes: (i) did you mainly pay attention to the general structure of the lesson given? 
(ii) did you mainly pay attention to the details in the lesson? (iii) did you try to 
learn by heart? (iv) did you try to associate what has been given with what you 
already knew? (v) did you listen to the teacher as usual? (vi) how interesting was 
the lesson? (vii) did you feel relaxed? (viii) did the announcement of an immediate 
test to some pupils influence your study behaviour? (ix) was there sufficient time to 
answer the test questions? Time to respond to the questions of the test and to the 
post-experimental questionnaire was subject-paced. 


TABLE 1
NUMBER OF SUBJECTS IN THE SEVERAL CONDITIONS

                  Immediate Test Expected     Delayed Test Expected
                  Male        Female          Male        Female
                  Subjects    Subjects        Subjects    Subjects
Lower Classes
  Dutch              6           6               5           7
  Physics            8           6               8           7
Upper Classes
  Dutch              4           4               3           5
  Physics            3           6               4           4


RESULTS 


Description of the covariables 

From the teachers we received the scores of previous examinations on Dutch 
and physics and we combined these to one score for each subject. We asked the 
local school guidance centre to submit the Group Embedded Figures Test (GEFT; 
Witkin et al., 1971) and the Achievement Motivation Questionnaire of Hermans (for 
an English description, see Hermans, 1970) in addition to the several personality and 
intelligence measurements they usually collect each year from these classes. We 
hypothesised that the previous examination scores should be the best covariable to 
equate the individual performance differences in the several classes. The other 
covariables were taken to look at interactions with the experimental manipulation. 
Cognitive style, as measured by GEFT, may be related to the way subjects try to 
organise the information as a function of the nearness of the test. The Achievement 
Motivation Questionnaire involves two anxiety subscales, a facilitating and a debilitating
anxiety scale. In our introduction, we suggested that subjects high in debilitating 
anxiety may show poor performance when the goal is closer. 


Analysis of variance (ANOVA) with Class, Teachers and Sex of subjects as
factors shows that there is more debilitating anxiety with female subjects than with
male subjects (F(1, 78) = 29.629, P < 0.001). On the other hand, scores on facilitating
anxiety are higher with male than with female subjects (F(1, 78) = 7.076, P < 0.01),
although the effect interacts with all factors involved in the ANOVA (F(1, 78) =
5.385, P < 0.025). The achievement motivation scores are higher in the upper classes
than in the lower classes (F(1, 78) = 10.317, P < 0.01). This higher achievement
motivation in the higher classes was somewhat unexpected, while the pattern of
differences between facilitating and debilitating anxiety as a function of the sex of
the subjects has repeatedly been reported by Hermans (1970, 1971, 1976). It is
generally accepted that female subjects are more field-dependent than male subjects.
However, this is only marginally confirmed in this study (F(1, 78) = 3.070, P < 0.10).
Subjects in the two classes on physics are more field-independent than subjects in the
two classes on Dutch (F(1, 78) = 21.603, P < 0.001). The two classes on physics
come from more demanding high school sections than the two classes on Dutch. 
Various intelligence measurements on the four classes point to the same difference 
between the classes on physics and Dutch (as the same intelligence measurements 
were not available for the four classes, no direct comparisons are possible, however). 
The results of the ANOVA on the previous examination scores have no important 
implications (except as covariable in our later analyses) as they are due to idiosyn- 
crasies of the topic and the teachers’ examination and evaluation systems. 

Performance on the two test parts

The teachers scored the answers on each test question on a three-point scale. 
These scores were averaged for the two test parts (literal reproduction and insight 
problems) separately. Table 2 gives the average values (expressed in percentage 
correct performance) as a function of the experimental manipulation and sex of 
subjects. 

An ANOVA was carried out involving five factors: Part 1 versus Part 2 (a within- 
subjects variable), the experimental manipulation (an immediate or delayed test 
expected), sex of pupils, class level, and the two teachers. The last two factors were 
included in the ANOVA in order to remove their effects (main effects and interactions) 
from the mean squares of error. As can be seen from Table 2, literal reproduction
(part 1) is much easier than the insight problems (part 2) (F(1, 70) = 69.069,
P < 0.001). In some classes, a ceiling effect on the first part of the test is likely to
have occurred, with many subjects showing perfect performance (26 per cent of
subjects). As the interaction between test parts, experimental
manipulation and sex of subjects was significant (F(1, 70) = 4.109, P < 0.05),
we performed separate ANOVAs on the two test parts involving the same factors
as in the preceding ANOVA except test parts. On part 1, no significant effects emerge,
while the experimental manipulation (F(1, 70) = 5.913, P < 0.025) and its interaction
with sex of subjects (F(1, 70) = 4.243, P < 0.05) are significant on part 2. From
Table 2, it appears that subjects perform at a higher level of accuracy on test part 2
when expecting an immediate test than when expecting a delayed test. This is,
however, only quite clear with male subjects. An a posteriori Tukey test clarifies
the significant interaction between experimental manipulation and sex of subjects
as follows: male subjects perform significantly better when the immediate test is
expected than when the delayed test is expected (t(2, 70) = 5.899, P < 0.01); when
a delayed test is expected, female subjects out-perform the male subjects (t(2, 70) =
3.249, P < 0.05); all other differences are not significant.


The next step in the analysis was to see whether the effects of the experimental
manipulation interact with the several covariables. Therefore, we used the multiple 
regression techniques outlined by Kerlinger and Pedhazur (1973). Several multiple 
regressions were done on each test part separately, each time with one of the covari- 
ables (previous examination scores, GEFT, achievement motivation, facilitating and 
debilitating anxiety). Again, class level and two teachers were included as factors 
in the analyses to obtain a smaller residual error term. Neither the experimental 
manipulation nor its interaction with sex interacts significantly with one of the 
covariables (F < 1 in all cases). Previous test scores are significantly related to
the performance on test part 1 (F(1, 57) = 14.467, P < 0.001), but not to the
performance on test part 2 (F < 1; the lack of relationship on test part 2 is not clear
to us). GEFT correlates significantly with test part 2 (F(1, 57) = 4.934, P < 0.05)
but not with test part 1 (F < 1). This was to be expected as a relationship between
field-independency and performance has generally been assumed in tasks where
solution depends on using an element in a different context from the one in which
it has been presented (Goodenough, 1976; Witkin et al., 1977). Only one more
significant effect emerged from the regression analysis: the interaction between
facilitating anxiety and the sex of subjects on test part 1 (F(1, 57) = 5.421, P < 0.025).
Its unravelling would burden our discussion unduly and it is not relevant to the key
issues of the present experiment.


TABLE 2
PERCENTAGE CORRECT PERFORMANCE ON THE TWO TEST PARTS

                                 Part 1    Part 2
Male Subjects
  Immediate Test Expected         73.6      54.8
  Delayed Test Expected           75.1      35.5
Female Subjects
  Immediate Test Expected         70.9      47
  Delayed Test Expected           65.5      46.


The lack of interaction effects between the experimental manipulation and the 
two anxiety scales did surprise us as we predicted that facilitating anxiety should 
improve the performance of the subjects expecting an immediate test while a debili- 
tating anxiety should disrupt optimal performance in the same condition. To be 
sure that the interactions were really absent, different kinds of statistics were used.
The regression analysis was repeated using the subtraction on the Z scores of the 
two anxiety scales as covariable (as performed by Hermans, 1969). From the Yerkes- 
Dodson Law a curvilinear relationship between performance and anxiety has some- 
times been predicted: an increase in anxiety results in improved performance and 
effectiveness up to a point and further increases in anxiety result in decrements in 
performance. However, the individual mapping of the test performances and the 
two anxiety scales (separately or combined), and the regression analysis with trend 
components give no indications of any quadratic relationship in our data. 


ANOVAs were conducted on the data from the post-experimental questionnaire 
to determine which questions reflected significant effects from the experimental 
manipulation and the sex of subjects. Again, the other factors of the preceding 
analyses were included mainly to remove their variances. Only one question provides 
a significant effect: subjects expecting a delayed test consider themselves to be more 
relaxed (F(1, 75) = 8.880, P < 0.01); a multivariate analysis on the nine questions
confirms the significant effect of the experimental manipulation (F(9, 67) = 2.253,
P < 0.05). The data from the nine post-experimental questions were correlated with
the performance on the two test parts. The same correlational pattern emerges on
the two test parts and in the several conditions. Two correlations (when pooling the
two test parts and the several conditions) are significant: subjects performing well on
the test reported being more relaxed (r = +0.300, P < 0.01) and having received enough
time to respond to the test questions (r = +0.267, P < 0.05).


DISCUSSION 


Evidence presented here indicates that male pupils acquire substantially more 
knowledge when expecting an immediate test than when expecting a delayed test but the
effect was only apparent on part 2 of the test. Females do not show any difference 
as a function of the time set. Looking back at the previous publications on this 
topic, no gender effects have been reported, but most studies do not mention the sex 
of their subjects. Götz and Jacoby (1974), and Thisted and Remmers (1932) explicitly
refer to a constant ratio of males to females across conditions but did not include
sex as a factor in the description and analysis of their data. 


The better performance when an immediate test was expected has repeatedly 
been observed (see introduction). However, the present research was undertaken with 
the ambition of moving toward a theoretical understanding by including a few 
personality variables. The analysis was guided by hypotheses drawn from the goal 
gradient hypothesis (and its assumed interaction with the two kinds of anxiety) and 
information processing theories. As no interactions between the time set and the 
covariables emerged, we failed to advance our knowledge about the effect of the 
expected time of test. Either our personality measurements did not possess enough 
validity or our hypothesis on their interactions with the time set was faulty. 


Our general pattern of findings on the covariables is clearly consistent with
previous publications with a positive correlation between performance on test part 2
and field-independency, boys showing more facilitating anxiety and females more
debilitating anxiety. The only exception is the weak gender difference on field-
independency, while many studies refer to strong sex differences on field-independency.
Therefore, we are inclined to believe that our measurements are valid.


From the sex differences on the two anxiety scales, one could hypothesise that 
males (showing more facilitating anxiety) should do better than females when expecting 
an immediate test, while females (showing more debilitating anxiety) should do better 
than males when expecting a delayed test. This is indeed what has been found 
(see Table 2). But to remain consistent with the hypothesis, females should do less 
well when expecting an immediate test than when expecting a delayed test. This is
clearly not borne out by the data. 


The post-experimental question on relaxation gives one more indication of the 
inadequacy of the goal gradient hypothesis to explain the better performance of the 
boys when expecting an immediate test rather than expecting a delayed test. All
subjects report being more relaxed when expecting a delayed test, which is consistent
with the goal gradient hypothesis. However, there is a significant positive correlation
(r = +0.300) between relaxation and the performance on the test, that is, the more
relaxed the subjects report being, the better their performance on the test.


Figure 1 gives the mean values and regression lines relating the relaxation
rating to the test performance on test part 2 as a function of the main conditions.
As the findings with female subjects are less clear-cut, we omit them from the discussion.
Looking at Figure 1, it is apparent that the difference between the test performance 
of the males with the two time sets should be increased if the relaxation in the two 
conditions was equated. The goal gradient hypothesis should have predicted a 
disappearing of the difference in test performance when the motivational level, as 
expressed in the data of the relaxation question, was equated. However, one has to 
be careful when using the post-experimental data to interpret the findings. First, 
the post-experimental findings are ratings of subjective experiences. Also, the 
responses on the post-experimental questions are influenced by the performance on 
the two test parts. However, the positive correlation between relaxation and test 
performance is in agreement with the general view that immediate test performance 
is impaired by high arousal while a delayed test performance is enhanced (Hockey,
1978). 


The present study was framed within the goal gradient hypothesis. However, 
our findings reject this hypothesis because no interaction effects between the experimental
manipulation and the anxiety scales were obtained and an equation of the
motivational level of the subjects should have increased the difference on test performance
of the boys when responding to test part 2 as a function of the expected
time of test. No evidence was obtained that a deeper processing of the learning
material is stimulated by expecting a delayed test. On the contrary, males showed 
better performance on test 2 when anticipating an immediate test. The explanation 
of this difference is most likely to be found in the males' foresight of more study 
time when a delayed test is expected. 


ADJUSTMENT OF EXAMINATION MARKS


By J. B. GREEN, C. B. BALDOCK AND M. E. AL-BAYATTI


(Department of Computational and Statistical Science and 
Department of Applied Mathematics and Theoretical Physics, 
University of Liverpool) 


SUMMARY. A simple unbalanced block model is proposed for examination marks, as
an improvement on the usual implicit model. The new model is applied to some real 
data and is found, by the usual normal linear theory F test, to give a highly significant 
improvement. Some alternative models are also considered. 


INTRODUCTION 


IN a university or other examination it often happens that candidates' results will be 
regarded as comparable even though they have not taken all the same subjects. The 
implicit model (Model 1: complete randomisation plus weighting) for the mark $Y_{ij}$ for
student $i$ and paper $j$, which is of weight $m_j$ (a small integer), is


$$Y_{ij} = m_j \alpha_i + \varepsilon_{ij}, \tag{1}$$


where $\alpha_i$ is the appropriate mark per unit (weight 1) for candidate $i$ (which may be
thought of as representing the candidate's average overall ability) and $(\varepsilon_{ij})$ are
independent error terms of zero mean. Let $n_j$ be the number of candidates taking paper
$j$. A direct average of (possibly some of) the marks $Y_{ij}$ (or $Y_{ij}/m_j$) is usually used to
indicate candidate $i$'s overall performance, and often the number of pass marks (or
other gradings) among the $(Y_{ij})$ is used too. Ordinarily little, if any, account is
taken of the different subjects involved. However, sometimes, it will be noticed that 
rather low (or high) marks occurred on one paper, so a lower (or higher) pass mark 
will be used for it. For some examinations it is ensured that approximately constant 
proportions of candidates fall into different grades from one year to another. For 
other examinations, marks are linearly transformed to ensure a previously chosen 
mean and variance. However, many examinations take no formal cognisance of the 
different subjects involved, although it is widely appreciated that it is easier for 
candidates to obtain higher marks in some subjects than in others. While it may be 
argued that not correcting for different subjects merely increases the variance of 
marks, but that overall things should balance out about right, individual candidates 
may suffer an appreciable handicap because they happen to have chosen subjects 
which will turn out to be marked relatively low, and they will receive no compensation. 


Of course, a single overall mark hardly does justice to the full set of a candidate's 
performances. Nevertheless an overall mark is usually required for ranking the 
candidates. It is true that marks in separate subjects are often considered in the 
assessment of candidates, but it might be better if the latter also were adjusted to 
compensate for low or high markings in particular subjects. 


Some different methods of correcting for subject variations have already been 
indicated, but when some subjects are involved for which $n_j$ is small, the use of
corrections based on subject mean and variance or proportions seems unsatisfactory. 
Alternatively we may think of the correction for a particular paper being multiplicative 
or additive. Some examiners occasionally use these (particularly the former) in an 
informal way. 


We here propose to consider primarily the additive method, using the method 
of least squares, but in applications to data we also compare the multiplicative 
method. Our proposed model (Model 2) is


$$Y_{ij} = m_j(\alpha_i + \beta_j) + \varepsilon_{ij}, \tag{2}$$

where $\beta_j$ is the effect of the $j$th paper, and we shall also assume that the independent
$\varepsilon_{ij}$ have distribution $N(0, m_j\sigma^2)$, and that


$$\sum_j k_j \beta_j = 0, \tag{3}$$


where $k_j = m_j n_j$. Some equation such as (3) is needed for identifiability. Taking
the variance as $m_j\sigma^2$ is not essential, but it seems reasonable. It would apply if
$Y_{ij}$ were the sum of independent marks for $m_j$ single units, each with variance $\sigma^2$.
Under the assumed model, the method of least squares for estimating the unknown 
parameters implies the usual optimum properties for least squares and for maximum 
likelihood. It also enables us conveniently to test against Model 1 using normal 
linear theory. 
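
The test against Model 1 mentioned here can be written out as follows. This is a sketch of the standard nested-model F statistic rather than a formula given in the text. It assumes $I$ candidates, $J$ papers and $n$ observed marks in total, with the candidate-by-paper design connected so that Model 2 has $I + J - 1$ free parameters. $S_1$ and $S_2$ are introduced here to denote the minimised weighted residual sums of squares $\sum\sum (\text{residual})^2 / m_j$ under Models 1 and 2 respectively.

```latex
F = \frac{(S_1 - S_2)/(J - 1)}{S_2/(n - I - J + 1)},
\qquad
F \sim F_{\,J-1,\; n-I-J+1} \ \text{under Model 1}.
```

A large value of $F$ indicates that allowing for paper effects $\beta_j$ gives a real improvement over the implicit model.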


It is not possible to estimate and allow for interaction between candidate $i$ and
paper $j$; such interaction must be absorbed into the error term $\varepsilon_{ij}$, inflating its
variance, analogous to the complete randomisation situation. Again, it is possible
that the variance will depend upon $j$. We do not consider this more complicated
possibility here (it is also implicitly disregarded in the usual methods), but regard 
Model 2 as potentially a great improvement upon Model 1 in representing examination 
marks data. 


Two references to the analysis of an unbalanced blocks design like our Model 2,
with unequal $k$s and unequal $h$s, where $h_i = \sum_{j(i)} m_j$ (where $\sum_{j(i)}$ means summation
over all $j$ values that occur with $i$), but with $m_j = 1$ for all $j$, are Pearce (1965, pp. 70-
76), and Scheffé (1959, pp. 112-119).


The papers of Hasofer (1977) and Backhouse (1978) also use the additive model 
considered here for the adjustment of examination marks, albeit equally weighted 
marks. Hasofer uses it simply as a technique to make a plausible adjustment and to 
estimate the added effects. Backhouse (1978) uses the analysis of variance approach,
applied earlier by Nuttall et al. (1974), and separately by Scott (1975) for a specially
designed experiment, by referring to the method for the unbalanced, non-orthogonal
situation expounded by Kendall and Stuart, Vol. II, Chapter 19 (1961) and Vol. III,
Chapter 35 (1966). Using this approach, Backhouse obtains estimates for effects and tests
for significant differences between them, chiefly between subject and GCE boards, but 
also between sexes (as also in the work of Forrest, 1971, and Forrest and Smith, 
1972), and between subject groupings. The additive methods in these papers are 
essentially the same as each other and as that used here, except that the latter works 
with unequal weights, appropriate to the data being investigated; also the present 
application is to marks rather than to a small set of grades, and so is likely to con- 
form better to the assumption of normality. 

Backhouse also discusses some objections suggested by others to the proposed
additive model (which is like our equation (2) with $m_j = 1$). He concedes that the
effect of paper $j$ (analogous to our $-\beta_j$) for each $j$, which he calls the "severity"
of marking of paper $j$, may well not be indicating marking severity, at least not
altogether. While he allows some grounds for some of the criticisms, he still concludes 
that the analysis affords a useful practical indication of the true situation. 


In an earlier paper, Backhouse (1976) compares three different methods of ranking 
a set of one group of CSE students who all took two certain examination papers, and 
one group of GCE students who all took two certain examination papers, one of 
which was one of the papers taken by the CSE group. The different methods are 
compared with the Board's ranking and discrepancies are discussed. However, the 
three methods required there to be one paper (at least) common to all the students, 
while the other methods discussed above have no such requirement, and are therefore 
more generally applicable. 
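

METHODS FOR ESTIMATION OF PARAMETERS

An additive method

We minimise, subject to equation (3), the weighted sum of squares:

$$S = \sum\sum \{Y_{ij} - m_j(\alpha_i + \beta_j)\}^2 / m_j + \lambda \sum_j k_j \beta_j, \tag{4}$$


where $\lambda$ is the Lagrangian multiplier, and $\sum\sum$ represents summation over all cases where
$i$ and $j$ occur together, so that $\sum\sum = \sum_i \sum_{j(i)}$. It is easily found that $\lambda = 0$ and


$$\sum_{i(j)} \{Y_{ij} - m_j(\alpha_i + \beta_j)\} = 0, \tag{5}$$

$$\sum_{j(i)} \{Y_{ij} - m_j(\alpha_i + \beta_j)\} = 0. \tag{6}$$

Defining $A_i = \sum_{j(i)} Y_{ij}$, $B_j = \sum_{i(j)} Y_{ij}$, $G = \sum\sum Y_{ij}$ and $N = \sum_j k_j$, we may rewrite (5)
and (6) as

$$\beta_j = B_j/k_j - \sum_{i(j)} \alpha_i / n_j, \tag{7}$$

$$\alpha_i = A_i/h_i - \sum_{j(i)} m_j \beta_j / h_i. \tag{8}$$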

We could make the situation more nearly symmetrical regarding $\alpha$s and $\beta$s by adding
$m_j\mu$ to the right-hand side of equation (2) and putting $\sum_i h_i \alpha_i = 0$, when $\mu = G/N$,
and (7) and (8) still hold. However, the $\alpha$s as defined above are more directly relevant
to us.


It can be shown that equations (7), (8) and (3) possess a unique solution. It 
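seems easiest to solve them iteratively in some such way as the following:


$$\alpha_{i0} = A_i/h_i,$$
$$\beta_{j0} = B_j/k_j,$$
$$\beta'_{j1} = \beta_{j0} - \sum_{i(j)} \alpha_{i0}/n_j,$$
$$\beta_{j1} = \beta'_{j1} - \sum_{l} k_l \beta'_{l1}/N,$$
$$\alpha_{i1} = \alpha_{i0} - \sum_{j(i)} m_j \beta_{j1}/h_i.$$

Generally

$$\beta'_{jr} = \beta_{j0} - \sum_{i(j)} \alpha_{i,r-1}/n_j,$$
$$\beta_{jr} = \beta'_{jr} - \sum_{l} k_l \beta'_{lr}/N, \tag{9}$$
$$\alpha_{ir} = \alpha_{i0} - \sum_{j(i)} m_j \beta_{jr}/h_i, \quad \text{for } r = 2, 3, \ldots,$$

where $N$ is the total number of units taken $= \sum_j k_j$.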


The process may be continued until, say, the relative reduction in the sum of 
squares S is sufficiently small. 
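
An iteration of this kind can be sketched in a few lines of Python. The sketch is illustrative only and is not the authors' program: it alternates the update of the paper effects from (5), a recentring step so that constraint (3) holds, and the update of the candidate abilities from (6), stopping when the relative reduction in S is negligible. The function name `fit_additive`, the dictionary layout and the small data set are all hypothetical.

```python
from collections import defaultdict


def fit_additive(marks, weights, tol=1e-12, max_iter=10_000):
    """Fit Y_ij = m_j * (alpha_i + beta_j) + error by alternating least squares.

    marks   : dict mapping (candidate, paper) -> mark Y_ij
    weights : dict mapping paper -> weight m_j
    Returns (alpha, beta, S) with S the weighted residual sum of squares.
    """
    papers_of = defaultdict(list)      # papers taken by each candidate
    cands_of = defaultdict(list)       # candidates taking each paper
    for cand, paper in marks:
        papers_of[cand].append(paper)
        cands_of[paper].append(cand)

    h = {i: sum(weights[j] for j in js) for i, js in papers_of.items()}
    k = {j: weights[j] * len(cs) for j, cs in cands_of.items()}
    n_units = sum(k.values())          # N, total number of units taken
    a_tot = {i: sum(marks[i, j] for j in js) for i, js in papers_of.items()}
    b_tot = {j: sum(marks[i, j] for i in cs) for j, cs in cands_of.items()}

    alpha = {i: a_tot[i] / h[i] for i in h}   # starting values (Model 1 fit)
    beta = {j: 0.0 for j in k}

    def weighted_ss():
        return sum((y - weights[j] * (alpha[i] + beta[j])) ** 2 / weights[j]
                   for (i, j), y in marks.items())

    previous = weighted_ss()
    for _ in range(max_iter):
        # Paper effects from equation (5), given the current abilities.
        beta = {j: (b_tot[j] - weights[j] * sum(alpha[i] for i in cands_of[j])) / k[j]
                for j in k}
        # Recentre so that sum_j k_j * beta_j = 0, i.e. constraint (3).
        shift = sum(k[j] * beta[j] for j in k) / n_units
        beta = {j: b - shift for j, b in beta.items()}
        # Abilities from equation (6), given the recentred paper effects.
        alpha = {i: (a_tot[i] - sum(weights[j] * beta[j] for j in papers_of[i])) / h[i]
                 for i in h}
        current = weighted_ss()
        if previous - current <= tol * max(previous, 1.0):
            break
        previous = current
    return alpha, beta, current


# Hypothetical marks for three candidates and three papers (paper p2 has weight 2).
weights = {"p1": 1, "p2": 2, "p3": 1}
marks = {("a", "p1"): 65, ("a", "p2"): 112, ("b", "p2"): 96,
         ("b", "p3"): 51, ("c", "p1"): 43, ("c", "p3"): 44}
alpha, beta, s2 = fit_additive(marks, weights)
```

The adjusted mark for candidate i on paper j is then `marks[i, j] - weights[j] * beta[j]`, and the returned sum of squares can be compared with the Model 1 value obtained by setting every paper effect to zero.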


Mark adjustment based on (5) and (6) has been practised by a number of 
examining authorities, but without the statistical justification of the foregoing analysis. 
The adjustment principle has been to arrange for the mean mark of each paper to be 
shifted so that it becomes equal to the mean estimated ability of the candidates taking 
that paper. Equation (8) can be regarded as a formula for the estimated ability of 
candidate i and (5) expresses the adjustment principle. The principle has been extended 
(more controversially) to the adjustment of standard deviations (Hasofer, 1977). 


Multiplicative methods 
A third possible model for the ideal performance of candidate i on paper j is
$E(Y_{ij}) = \alpha_i \beta_j$. The method of least squares could again be followed, but in this
case it does not imply equality between the total $\sum_{j(i)} m_j y_{ij}/\beta_j$ of the adjusted marks of
candidate i and his estimated overall performance $h_i \alpha_i$ (cf. equation (8)). A method
which preserves this feature would be more acceptable; two such procedures can be
proposed by analogy from equations (5) and (6).


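Model 3 (i) Scaling of the papers

$$\sum_{i(j)} \left( \frac{y_{ij}}{\beta_j} - \alpha_i \right) = 0, \tag{10}$$

$$\sum_{j(i)} m_j \left( \frac{y_{ij}}{\beta_j} - \alpha_i \right) = 0. \tag{11}$$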


This has the advantage of leading to a system of linear equations for the quantities
$1/\beta_j$ which can be shown to possess a unique solution when subjected to a suitable
normalisation rule. The natural normalisation procedure is to scale all the $\beta_j$ by a
constant multiple so as to satisfy


В, 
This also ensures that 
Xj, = YXY, (13) 


Equations (10), (11), (12) are perhaps most conveniently solved iteratively. 
Model 3 (ii) Symmetric estimation of the parameters 


Ei- В 
25 e т) = 0, (14) 


> (5. -та) = 0. (15) 
Jo) J 


This shares with the additive method the advantage of symmetry between treat- 
ment and subject. Normalisation can again be effected by means of (12). 


Equations (14), (15) and (12) may be solved iteratively by a process similar to 
(9), which seems to converge fairly rapidly when applied to data, but no proof of the 
existence or uniqueness of the solution has been found. 


APPLICATION TO DATA 


The least squares method and the multiplicative method (Model 3(i)) were 
applied to two sets of examination marks, departmental (within one department), 
and faculty (within a faculty, involving marks from a number of departments). 


Table 1 shows the original departmental marks and Table 2 the original and 
adjusted student percentages according to the linear least squares and the multi- 
plicative methods. There were 12 students and 25 papers. In this case all the students 
took the same number of units, namely 12. The changes in the marks, and in the 
overall student averages, are not so alarmingly large as to indicate a drastic, un- 
predictable, uncontrollable transformation, nor yet too small to indicate that the 
exercise is not worth the bother. Some students would have their gradings changed 
by these adjustments. The same applies to the faculty data, but there were too many 
of those to conveniently include a table like Table 1, there being 42 students and 46 
TABLE 1
ORIGINAL MARKS (DEPARTMENTAL) 





2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 2 







1 2 1 1i 2 2 1 2 1 1 1 1 £ 1 1 1 2 2 2 2 1 1 2 
36 А 48 71 26 28 33 24 33 53 
29 31 20 23 20 20 17 28 47 
27 44 53 33 28 30 33 36 51 
49 52 33 18 24 17 45 48 

32 61 32 33 33 26 11 
6 0 26 43 18 32 19 19 24 18 
27 30 34 50 34 29 25 31 33 68 
13 13 21 22 22 24 24 15 30 64 

21 23 21 40 25 16 28 33 9: 
27 77 61 28 33 92 29 
31 23 60 21 29 25 24 28 17 57 
21 17 20 16 10 21 19 18 41 

TABLE 2 
PARAMETER ESTIMATES 


            Student averages ($2\alpha_i$)            $\beta_j$
i or j    Model 1   Model 2   Model 3      Model 2   Model 3

   1       58.7      62.7      62.4          -4       0.838
   2       39.2      42.7      41.0           0       0.993
   3       55.8      59.4      59.3           5       1.145
   4       49.3      57.       56.5          -6       0.716
   5       62.2      55.2      54.4           3       1.109
   6       34.2      33.8      33.5          -3       0.842
   7       60.2      59.4      60.2          -6       0.787
   8       41.3      40.5      40.5           6       1.275
   9       50.2      44.6      44.5          -3       0.897
  10       68.2      64.2      65.9           0       1.003
  11       52.5      51.6      52.5          -4       0.860
  12       30.5      31.4      31.5           2       1.070
  13                                          4       1.179
  14                                         -5       0.833
  15                                          1       1.022
  16                                         -2       0.888
  17                                          3       1.095
  18                                         -5       0.666
  19                                         14       1.394
  20                                         -6       0.800
  21                                         -5       0.854
  22                                          3       1.125
  23                                          3       1.095
  24                                          1       1.048
  25                                         10       1.419





papers. Here there were unequal numbers of units per student. A particular
examining body may wish to assess each student on the best u units, for some chosen
value of u, and perhaps to consider how many units were passed, using adjusted
marks, by each student. These policies can easily be catered for.

In Tables 1 and 2 we show the original mark (out of 50 per unit) for each student
and paper and the average marks under Models 1, 2 and 3, expressed as percentages.
These are twice the raw average for Model 1, and $2\alpha_i$ for Models 2 and 3.
A convenient indicator of the efficiency of a method of adjustment is the residual
mean square when applied to data. The mean squares for the two sets of data and
Models 1, 2 and 3 (i) are shown in Table 3. We can perform a simple F test to compare
Models 1 and 2, using normal linear theory.





TABLE 3
MEAN SQUARES, ETC. ABOUT THE FITTED MODELS

                    Departmental                        Faculty
           df   Sum of sqs    Mean sq        df   Sum of sqs    Mean sq

Model 1   132    111669.60     845.98       316    393309.00    1244.65
Model 2   107      3167.77      29.61       270     13368.20      49.51
Model 3   107      2841.02      26.55       270     24418.94      90.44





We have for departmental and faculty data respectively

$$F_{25,107} = \frac{108501.8}{29.6053 \times 25}$$

$$F_{46,270} = \frac{391980.8}{49.5118 \times 46}$$


As we might expect, these are both very highly significant, giving strong evidence 
of very definite differences between marks obtained on different papers, and of the 
superiority of Model 2 to Model 1 in representing the data. 
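As a check on the arithmetic, the F ratio for the departmental data can be recomputed from the Table 3 entries, read here as a sum of squares of 111669.60 on 132 df for Model 1 and 3167.77 on 107 df for Model 2. The sketch below uses the usual extra-sum-of-squares form of the test; the helper name `f_ratio` is ours and is not taken from the paper.

```python
def f_ratio(ss_reduced, df_reduced, ss_full, df_full):
    """Extra-sum-of-squares F ratio for a reduced model nested in a fuller one.

    Returns (F, numerator df, denominator df).
    """
    extra_df = df_reduced - df_full
    # Mean square explained by the additional parameters of the fuller model.
    numerator = (ss_reduced - ss_full) / extra_df
    # Residual mean square of the fuller model.
    denominator = ss_full / df_full
    return numerator / denominator, extra_df, df_full


# Departmental data: Model 1 (reduced) against Model 2 (full).
f_dept, df1, df2 = f_ratio(111669.60, 132, 3167.77, 107)
print(round(f_dept, 1), df1, df2)  # prints 146.6 25 107
```

The result agrees with the departmental value of 146.6 on 25 and 107 degrees of freedom quoted in the text.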


The mean squares for Model 3 (ii) are also considerably smaller than those for
Model 1, but slightly larger than those for Model 2. Model 3 (i) yields a slightly
lower mean square than Model 2 for the departmental data (N.B. it effects some
large adjustments in some papers with very few candidates), but the mean square for
the faculty data is nearly double that of Model 2. It is not convenient to test between
Models 2, 3 (i) and 3 (ii), but there appears to be no clear difference between their
accuracies (comparing mean squares).


$$F_{25,107} = 146.6 > 2.41 = F_{25,107}\ (0.1\%),$$

$$F_{46,270} = 172.1 > 1.91 = F_{46,270}\ (0.1\%).$$


CONCLUSION 


Both sets of data examined, department and faculty, reveal strong evidence of 
real differences between the marks for different papers, which renders the usual 
assessment of a candidate's performance on the basis of his average mark unsatisfactory
and possibly unfair when different students take different combinations of
papers. The authors believe that this would generally be found to be so. Of the 
two alternative approaches examined, the additive one (corresponding to Model 2) 
seems possibly to accord better with our data, though there seems to be little to choose 
between it and Model 3. It is not clear why many examiners prefer multiplicative to 
additive adjustment. Another factor, which we do not discuss here, is that multiplying 
introduces an artificial element into the relative discriminating power of each paper. 
It would seem that, if an adjustment scheme is required, Model 2 is to be preferred. 


Both adjustment schemes considered have retained the same overall average
as in the original data. This is quite appropriate if one considers that the variation
of general level of marks from one year to another reflects a genuine variation in
student ability. If, however, one believes that the between-years variation of average
mark reflects rather variations in the stiffness of all that is involved in the teaching
and examining process, one may wish to stabilise this by adjusting the overall mean to
some previously chosen value, say 50 per cent. This can easily be done. Alternatively,
one may wish instead to choose in advance two approximate percentiles of
the distribution of the final assessments. This, too, can easily be achieved by an
appropriate linear transformation of the adjusted marks.
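Both rescalings amount to a linear map of the adjusted marks. The following sketch is illustrative only: the function names are hypothetical, and percentiles are computed by linear interpolation between order statistics, which is one convention among several and is not specified in the paper.

```python
def shift_to_mean(marks, target_mean):
    """Add a constant so that the overall mean equals target_mean."""
    offset = target_mean - sum(marks) / len(marks)
    return [x + offset for x in marks]


def percentile(values, p):
    """p-th percentile (0-100) by linear interpolation between order statistics."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * p / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (position - lower) * (ordered[upper] - ordered[lower])


def map_percentiles(marks, p_low, target_low, p_high, target_high):
    """Linear map sending the p_low-th and p_high-th percentiles to chosen values."""
    q_low = percentile(marks, p_low)
    q_high = percentile(marks, p_high)
    slope = (target_high - target_low) / (q_high - q_low)
    return [target_low + slope * (x - q_low) for x in marks]
```

For instance, `map_percentiles(marks, 25, 40, 75, 60)` would place the lower quartile of the final assessments at 40 and the upper quartile at 60, leaving the rank order of candidates unchanged.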
HIGHLY ANXIOUS PUPILS IN FORMAL AND INFORMAL
PRIMARY CLASSROOMS; THE RELATIONSHIP BETWEEN 
INFERRED COPING STRATEGIES AND: I—COGNITIVE 
ATTAINMENT 


By BARBARA E. WADE 
(Department of Social Administration, The London School of Economics) 


SUMMARY. This study was carried out in conjunction with the Teaching Styles project
(Bennett, 1976). Questionnaire measures of ‘anxiety’ and ‘achievement motivation’,
together with cognitive tests of English, mathematics and reading, were administered to
a sample of 956 primary school pupils in classes taught by teachers representing formal,
informal and mixed teaching styles at the beginning and towards the end of the final
year in primary school. Based on the premise that self-report measures of ‘anxiety’ and
‘achievement motivation’ may be indicative of coping strategies, zonal analyses of
post-test data were carried out separately by sex and within teaching style. Results
showed higher levels of attainment for highly anxious highly motivated pupils (an
inferred coping strategy of approach) than for highly anxious low motivated pupils (an
inferred coping strategy of avoidance). Further analyses indicated a close link between
anxiety, motivation and ability for girls, a finding interpreted in terms of sex-related
conformity, but for boys considerable discrepancies in attainment in relation to the
inferred coping strategies were still in evidence at the lower ability level under all
teaching styles. At the upper ability level significant discrepancies in attainment were
shown only for boys in formal classrooms, where an inferred coping strategy of
‘approach’ was found to be associated with the highest level of attainment and an
inferred coping strategy of ‘avoidance’ was associated with the lowest level of attainment.
These findings are interpreted in terms of extrinsic motivational incentives,
success and failure experience and possible parental influence. Further analysis
indicated that ‘avoiders’ tended to be introverted and ‘approachers’ extraverted in
personality. This finding is interpreted in the light of Gray’s 1970 postulate that
introverts are relatively more susceptible to punishment/non-reward, whereas extraverts
are relatively more susceptible to reward.


INTRODUCTION 


IN the past there has been much speculation about the relationship between instability 
and attainment. Research findings have tended to show low negative relationships 
at primary school level although there are indications of an age-related change which 
takes place in the later years of secondary schooling (Entwistle, 1973). Rather less 
research has been conducted into the relationship of achievement motivation to 
attainment, although studies utilising self-report measures of this variable with 
secondary school samples comprising both sexes (Reiter, 1964; Furst, 1966; Entwistle, 
1968; Kestenbaum and Weiner, 1970) would seem to indicate that it is equally worthy 
of consideration. In addition there are indications that level of expressed achievement
motivation may have some bearing on cognitive outcome relative to level of anxiety 
(Finlayson, 1972; Entwistle, 1973). 

Atkinson (1964) defines motives as dispositional tendencies which tend to be 
latent and are activated by situational cues. His theory of achievement motivation 
makes the assumption that all individuals have both a motive to achieve success and 
a motive to avoid failure. This implies that for any individual in the performance 
of a task there is an approach-avoidance or excitation-inhibition conflict. Also 
emphasising the situational determinants of drive Mandler and Sarason (1952) 
suggest that coping tendencies in response to anxiety are of two main types: those 
which are manifested in avoidance behaviour with attempts to leave the task situation, 
and those which are manifested in approach behaviour and which lead to the reduction
of anxiety through task completion. Thus, if coping strategies mediate between
anxiety and behaviour, both positive and negative outcomes in terms of task performance
might be expected from highly anxious individuals. For example, in the
classroom situation highly anxious ‘approachers’ might be expected to apply themselves
more diligently and consequently show a higher level of achievement than
highly anxious ‘avoiders’, who might be expected to seize every opportunity of
evading the task and in consequence achieve less. The problem in undertaking 
research into the differential effects of anxiety upon attainment would therefore 
seem to rest upon the identification of those adopting different coping strategies. 


Previously, attempts to assess defensive mode have been made using scales
developed to assess the tendency to ‘lie’ or to ‘deny’ (Zimbardo et al., 1963; Hill
and Sarason, 1966; O'Reil and Wightman, 1971). Two assumptions are
involved in the use of these scales:


(1) That an avoidance strategy is manifested by denial of ‘anxiety’ together with
high ‘defensiveness’ or ‘lie’ scores.
(2) That the tendency to ‘lie’ is consistent across situations.


If the first assumption is correct then it is rather strange that some children should
obtain both high ‘anxiety’ and ‘defensiveness’ or ‘lie’ scores. The second
assumption is questionable since the celebrated study of Hartshorne and May (1928) 
showed that children who practise deceit in one situation do not necessarily do so 
in another. 


An alternative approach is that of Alpert and Haber (1960) who devised the 
Achievement Anxiety Test. Based on the view that increased specificity of scales
enhances predictive capacity, the AAT comprises items designed to elicit
responses enabling assessment of the extent to which anxiety facilitates or debilitates
performance in examinations or tests. Whilst this approach is of great interest in
view of the extensive literature on defensive mode, by emphasising the extrinsic
interactive aspects of subjects and situations the construct of ‘anxiety’ would seem
almost redundant.


A review of empirical research (Wade, 1979) gave rise to a strong impression that 
scores on self-report measures that refer specifically to the relevant situation may 
provide an indication of the coping strategies likely to be adopted in that situation. 
This impression was gained largely from two sources, the first being recognition of the
anomaly suggested by a high score on a measure of ‘test anxiety’ together with a low
score on a measure of ‘need achievement’ (anxiety about evaluation together with
indifference to success). The second source stemmed from a suggestion by Finlayson
(1972) following a study investigating the relationship between a projective measure
of ‘n.ach’ and ‘neuroticism’ to a self-report measure of ‘achievement motivation’
following experience of success or failure. This suggestion was to the effect that 
high drive boys experienced conflict when their expectations of success were not 
confirmed and that this resulted in denial of motivation on a self-report measure: 
an interpretation in keeping with previous research which has failed to demonstrate 
a relationship between projective and self-report measures of ‘achievement motivation’.
This line of thought suggested that a high score on a self-report measure of
‘achievement motivation’ together with a high score on a self-report measure of
‘anxiety’ might indicate a coping strategy of approach. Conversely a coping
strategy of avoidance might be identified by high ‘anxiety’ together with low
‘achievement motivation’.

Accordingly, on the basis of coping strategies to be inferred in the manner 
described, this study investigates the hypothesis that pupils with high ‘anxiety’ and
high ‘achievement motivation’ will attain more than pupils with high ‘anxiety’ and
low ‘achievement motivation’.
DESIGN OF THE INVESTIGATION

Subjects 

The sample comprised 481 girls and 475 boys, all of whom were fourth year 
primary school pupils. Three hundred and twenty-six pupils were in classes taught 
by 12 teachers classified ‘formal’ with regard to teaching style, 332 pupils were in
classes taught by 13 teachers classified ‘informal’ and 298 pupils were in classes
taught by 12 teachers representing ‘mixed’ teaching methods. Teaching style was
defined by means of cluster analysis of teacher responses to a questionnaire (Bennett
and Jordan, 1975). Teachers designated ‘formal’ tended to teach subjects separately
by way of class teaching and individual work, test frequently, give grades or marks
and curb movement and talk. Teachers designated ‘informal’ favoured an integrated
approach by way of individual or group work, rarely gave tests, grades or marks and
placed little restriction on movement or talk. Teachers representing ‘mixed’ teaching
styles comprised a heterogeneous group with differing emphases. Attrition of the
original sample of 1,150 pupils was rather large due to change of school or absence 
on one of the numerous testing occasions. 


Measures of anxiety and achievement motivation 

With the exception of the Test Anxiety Scale for Children (Sarason et al., 1958) 
existing measures designed to tap the dimensions of anxiety consist of items which 
are either devoid of situational referents or which contain referents to several different 
situations. These scales are assumed to assess a generalised tendency to worry. 
Similarly the most commonly utilised test for the assessment of ‘need achievement’
has been a projective one derived from the Thematic Apperception Test (Murray,
1938) by McClelland (1969). This test is likewise assumed to assess ‘n.ach’ as a
generalised characteristic. It appeared that there was a need for reliable self-report
measures, permitting ease of administration and scoring, that would tap differing 
aspects of ‘achievement motivation’ and aspects of ‘anxiety’ other than those 
relating solely to evaluation. In addition to these requirements the measures should 
contain referents to the classroom situation, to enable the identification of possible 
interaction effects in their use within the Teaching Styles project. 


Accordingly three pilot studies were carried out and new scales were devised
using principal components analysis. Test-retest reliability for the ‘Anxiety’ Scale
over a 2-week period was 0.79 (N = 138) and 0.65 for the present sample with an
interval of 8 months between testing. Test-retest reliability for the ‘Motivation’
Scale was 0.71 over a 2-week period and 0.57 with an interval of 8 months between
testing. Coefficients of internal consistency were obtained from post-test data in this
study; these were 0.9 for the Anxiety Scale and 0.78 for the Motivation Scale
(Wade, 1979).


Procedure 

As part of the Teaching Styles project three cognitive measures, the Edinburgh
Reading Test, Stage 3 (Moray House), the English Progress Test D3 (NFER) and the
Mathematics Attainment Test DE2 (NFER), were administered to the sample at
the end of the third year and in May of the fourth year in primary school. The 
Anxiety and Motivation scales were given to the sample at the beginning of the final 
year and were readministered the following May. 


RESULTS 


The findings of Eysenck and Cookson (1969) and Entwistle and Cunningham 
(1968) indicate the importance of undertaking analyses separately by sex. In this 
study the within sex distributions were used when creating groups for zonal analysis. 
In this way the assumption that identical scores for boys and girls are indicative of
the same levels of ‘anxiety’ or ‘achievement motivation’ was avoided. Since the
analyses were to be undertaken in relation to teaching style, post-test data were
utilised.

Scores on the Anxiety and Motivation scales were split at the 33rd and 66th 
percentiles, thus creating nine groups for boys and nine groups for girls. Means 
and standard deviations for attainment in English, mathematics and reading were 
computed for each group within teaching style. A comparison of the attainment
scores of pupils with high ‘anxiety’ and low ‘motivation’ (avoiders) and pupils with
high ‘anxiety’ and high ‘motivation’ (approachers), with t tests to ascertain the
significance of differences, is given in Table 1.
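The zonal grouping described above can be sketched as follows. This is an illustration under assumptions rather than the procedure actually used: the percentile convention (linear interpolation), the treatment of scores falling exactly on a cut point, and the field names `anxiety` and `motivation` are all hypothetical. To respect the within-sex distributions, the function would be called once with the boys' records and once with the girls'.

```python
def percentile(values, p):
    """p-th percentile (0-100) by linear interpolation between order statistics."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * p / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (position - lower) * (ordered[upper] - ordered[lower])


def band(score, low_cut, high_cut):
    """Classify a score as low (L), middle (M) or high (H)."""
    if score <= low_cut:
        return "L"
    if score > high_cut:
        return "H"
    return "M"


def zonal_groups(pupils):
    """Split one sex's pupils into up to nine anxiety-by-motivation zones."""
    anxiety = [p["anxiety"] for p in pupils]
    motivation = [p["motivation"] for p in pupils]
    a_cuts = (percentile(anxiety, 33), percentile(anxiety, 66))
    m_cuts = (percentile(motivation, 33), percentile(motivation, 66))
    groups = {}
    for p in pupils:
        key = (band(p["anxiety"], *a_cuts), band(p["motivation"], *m_cuts))
        groups.setdefault(key, []).append(p)
    return groups
```

In this labelling the ‘approachers’ (HAHM) are the pupils in zone `("H", "H")` and the ‘avoiders’ (HALM) those in zone `("H", "L")`.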

The results obtained appear to be in line with what would be expected on the 
basis of the inferred coping strategies of approach and avoidance, HAHM pupils 
showing higher levels of attainment in all subject areas than HALM pupils. However, 
variations in the size of the discrepancies appear to be related to both sex and teaching 
style. Under a formal teaching style a high level of ‘anxiety’ coupled with a high
level of ‘motivation’ would appear to be conducive to a high level of attainment for
boys. This would also seem to be the case for boys under an informal teaching style
but the difference is less marked. Differences in attainment between the two groups
of boys under the heterogeneous group of teachers comprising the ‘mixed’ teaching
style are much smaller and fail to reach the 0.05 level of significance. The position
with regard to girls is less pronounced and significant differences are shown only 
for attainment in English by girls in formal and informal classrooms and for attain- 
ment in mathematics and reading by girls under the mixed teaching style. 


TABLE 1 
POST-TEST ATTAINMENT SCORES BY SEX AND WITHIN TEACHING STYLE FOR HIGH ANXIETY HIGH
MOTIVATION PUPILS (APPROACHERS) AND HIGH ANXIETY LOW MOTIVATION PUPILS (AVOIDERS)
High Anxiety High Anxiety 
High Motivation Low Motivation 





Mean SD Mean SD P 
Subjects N= 23 N=7 Difference t (1-tailed) 
Formal English 115-0 9.78 1069 9:56 9-0 1:83 0-05 
Girls Maths 108-6 15:38 99-4 8-47 9:2 1:67 NS 
Reading 109-7 10-74 101-3 13 52 84 1-63 NS 
Informal English 106-7 10:37 97:6 13-37 94 1.92 0-05 
Girls Maths 97.7 10-55 92-6 8:82 54 0-99 NS 
Reading 102 S ae 944 P T9 1:55 NS 
Mixed English 110-0 14:34 104-0 15:76 6-0 134 NS 
Girls Maths 103-9 12:55 93:8 15:88 10-1 2:10 0-05 
Reading 112-1 gree 595 5“ 12:8 2:59 0-01 
Formal English 415-5 12-71 97-9 11-24 17-6 3-67 0:01 
Воув Maths 113:5 1547 934 1425 20-1 3-46 0-01 
Reading 1161 13:03 93.9 13:13 22:2 4-30 0-01 
N = 24 М <= 12 
Informal English 107.5 10.83 93:4 1458 14-1 397 0-01 
Воуз Maths 105.9 10-81 96-4 16:81 9.5 1:95 0-05 ` 
Reading 107-1 11:51 98:3 12:83 8-8 1:82 005 
N= 12 Мей 
Mixed English 98:3 9.42 94-7 11-08 3:6 0-71 NS 
Boys Maths 95:8 16:01 88.2 9.13 76 134 NS 
Reading 101-8 14-52 93-3 11-02 8-5 1-45 NS 
The question remains as to what extent measures of ‘anxiety’ and ‘motivation’
are reflective of differences in ability level. Accordingly further analyses were
undertaken with ability as an extra variable.


In order to retain adequate group size, median splits were made of scores on 
the Anxiety and Motivation scales utilising the within sex distributions. An average 
cognitive score, derived from the mean of standard scores of pre-test attainment in 
English, mathematics and reading, was taken as an indication of ability and a median 
split was made thus creating eight groups for boys and eight groups for girls. Comparisons
of mean post-test attainment scores of ‘approachers’ (HAHM) and
‘avoiders’ (HALM) are given in Table 2. The mean scores of the other groups are
also included.

Although the discrepancies in attainment are still in the predicted direction, 
among the girls there would seem to be little difference in attainment between HAHM 
and HALM pupils within ability level in relation to any teaching style. For boys 
there are significant differences in attainment in all types of classroom at the lower 
ability level. At the upper ability level significant differences in attainment can be 
seen only in relation to formal teaching methods. The possibility that these results 
could be explained by reference to level of ‘achievement motivation’ alone would
appear to be discounted by the finding that, among the boys, both high and low
ability ‘approachers’ evidenced higher attainment than any of the other groups
under a formal teaching style.


DISCUSSION 


Since the second set of analyses was carried out using median splits on the 
Anxiety and Motivation scales, less discrimination in attainment levels between 
groups might be expected. With the reservation that the indicator of ability utilised 
in this study was that of pre-test attainment, a stringent measure which is also indicative 
of skill in school work, these data strongly suggest close links between levels of
‘motivation’ and ability for girls. It may be that ‘motivational’ pattern is less
relevant for girls than for boys. If girls are more conformist, their affective tendencies
may be overridden by situational demands. Marlowe and Gergen (1969, p. 617) 
state: 

“... for women, culturally determined role expectations tend to prescribe
passivity and compliance as sex-appropriate behaviour, whereas the male sex
role, with its emphasis on initiative, success and achievement, bears less directly
on the domain of conformity behaviour.”


Studies by Janis (1954) and Goldberg et al. (1954) were interpreted as indicating
a negative relationship between anxiety and conformity for males. A study by
Vaughan (1964) suggested that conformity is positively related to anxiety among
females. If this is the case then, given high levels of anxiety, little difference would
be expected in cognitive performance for females even though levels of expressed
‘achievement motivation’ were to differ, but for males a high level of anxiety might
well have a detrimental influence on cognitive performance unless moderated by a
high level of ‘achievement motivation’.


Under informal and mixed teaching styles ‘motivation’ pattern would seem to
be closely linked with attainment only for boys of lower ability and the highest level
of attainment is evidenced by the LAHM groups; a finding in keeping with Atkinson's
(1964) theory of motivation. In accordance with the hypothesis relating to coping
strategies HAHM pupils (approachers) show a significantly higher level of attainment
than HALM pupils (avoiders). Under a formal teaching style ‘motivational’
pattern would appear to be highly relevant to level of attainment for boys regardless
of ability level, HAHM boys showing the highest level of cognitive performance.
Post hoc speculation leads one to suppose that this might be due to the incremental
effect of extrinsic incentives in a highly structured situation, together with the more 
competitive nature of the motivation pattern of boys as opposed to that of girls 
(Wade, 1979). This interpretation is seemingly in accordance with that of Atkinson 
(1964) who suggests that high performance, when the tendency to approach success 
(need achievement) is of equal strength to the tendency to avoid failure (anxiety), can 

be attributed to the influence of extrinsic motives and incentives. However, examin- 
ation of attainment scores for LALM boys, who also fit this description, shows that 
at the lower level of ability these pupils have a low level of attainment. It would 
seem that there are at least two possible explanations of this phenomenon: either 


(1) The presence of extrinsic motivational incentives has a differential effect on
attainment according to levels of motivation, anxiety, ability and sex. (The
importance of moderator variables or trait interaction in prediction has been
stressed by Saunders, 1956, and Argyle and Little, 1972, but the Atkinson
formulation makes no allowance for interaction.)


Or (2) Additional extrinsic incentives stemming from outside the classroom, such
as parental ‘spurring’, also mediate between ability and attainment.


With regard to the first alternative, it may be that the use of such incentives by 
formal teachers is not uniform. If stars, good marks, or words of praise are given
selectively, then this might serve to polarise differences and change the resulting 
tendency of those scoring high or low on both motives. The second alternative will 
be discussed in a further paper reporting the results of investigations into the class- 
room behaviour and home backgrounds of a 10 per cent subsample of these same 
pupils. 


The possibility that ‘avoiders’ might be denied access to reward is deserving of
further consideration. Gray’s (1970, 1971) proposed modification of Eysenck’s 
(1967) theory of extraversion is relevant in this context. Gray suggests that an increas- 
ing degree of emotionality or neuroticism represents an increasing degree of sensitivity 
to both reward and punishment, whereas an increasing degree of introversion represents 
an increasing degree of sensitivity to punishment or non-reward alone. These relation- 
ships are illustrated in Figure 1. According to Figure 1 the introverted neurotic 
would be highly sensitive to punishment whereas the extraverted neurotic would be 
more susceptible to reward. 


With the reservation that the measure of ‘anxiety’ used in this study is a measure
of state ‘anxiety’ related to the classroom situation and cannot therefore be directly
equated with the trait measure of ‘neuroticism’ devised by Eysenck, but bearing in
mind that those high in trait ‘anxiety’ tend to be high in state ‘anxiety’ and that the
mean ‘neuroticism’ score of 12.0 for ‘avoiders’ is greater than the mean for the
whole sample (9.04), it would seem that investigation of the extraversion scores of
‘approachers’ and ‘avoiders’ is indicated. Table 3 gives the extraversion scores for
these two groups of pupils for whom the attainment scores are given in Table 1.


The mean extraversion score of ‘avoiders’ can be seen to be consistently lower
than that of ‘approachers’ and is considerably lower in five out of the six comparisons
made. This finding fits neatly with the fear-frustration hypothesis proposed
by Gray (1971), which draws a parallel between non-reward and punishment, and 
with his suggestion that introverts are more susceptible to punishment and non- 
reward whereas extraverts are more susceptible to reward. 


Sensitivity to punishment or non-reward alone could be conceptualised as ‘ fear 
of failure’ whereas sensitivity to reward could be conceptualised as ‘fear of not 
succeeding’. Since sex differences have been found to be all important apropos the 
‘fear of success’ found by Horner (1968) in women, it may be that this distinction
Figure 1

PROPOSED RELATIONSHIPS OF (a) SUSCEPTIBILITY TO PUNISHMENT TO (b) THE DIMENSIONS OF INTROVERSION-EXTRAVERSION
AND NEUROTICISM. THE DIMENSION OF ANXIETY (DIAGONAL LINE) REPRESENTS
THE STEEPEST RATE OF INCREASE IN SUSCEPTIBILITY TO PUNISHMENT. After Gray, 1970.

[Figure labels: NEUROTIC; susceptibility to reward; susceptibility to punishment.]



TABLE 3 


EXTRAVERSION SCORES (ТРО) OF HIGH ANXIETY, HIGH MOTIVATION AND HIGH ANXIETY, LOW
MOTIVATION BOYS AND GIRLS WITHIN TEACHING STYLE

                             HAHM                    HALM
                         (Approachers)            (Avoiders)
Teaching
Style       Sex        N      M      SD        N      M      SD     Discrepancy

Formal      Girls     23    19.04   2.68       7    18.43   3.99       0.61
            Boys      24    20.00   2.81      10    14.60   4.88       5.40
Informal    Girls     25    19.90   2.64      11    15.36   2.96       4.54
            Boys      24    20.30   2.71      12    17.90   2.29       2.40
Mixed       Girls     16    19.31   3.10      13    15.77   4.54       3.54
            Boys      12    19.75   3.24      11    16.55   3.37       3.20
All                  124    19.75   2.85      64    16.34   3.97       3.41


will be more important for boys and will be subject to cultural variation for girls 
depending on the extent to which the feminine role is emphasised. 


Further examination of Table 3 reveals that the standard deviations of the
extraversion scores of ‘avoiders’ are, for the most part, larger than those of
‘approachers’. Although the standard error is likely to be greater for small samples,
there is no a priori reason for standard deviations to be consistently greater for
smaller samples, so these data merit further consideration. Compared to the
standard deviation of extraversion for the whole sample the standard deviations for
‘approachers’ are somewhat smaller, which would seem to indicate that these pupils
are generally extraverted in personality. The standard deviations of extraversion
scores for ‘avoiders’ tend to be larger, which indicates a wider spread on this dimension.
The supposition, based on Gray's theory, that ‘avoiders’ tend to be more sensitive
to punishment alone cannot therefore be applicable to all these pupils. An alternative
explanation is required. Eysenck's (1967) theory of introversion and neuroticism and
the relation of these two dimensions of personality suggests that whereas the introverted
neurotic tends to the dysthymic neuroses the extraverted neurotic tends to
psychopathic behaviour. This would seem to imply that avoidance behaviour may
take different forms, a point which will be taken up in the descriptions of pupil
behaviour in the second paper.


Reconsideration of previous research in the light of the present findings leads 
to further speculation. Positive correlations between extraversion and scholastic 
attainment at the primary school level have been obtained in several other studies 
(Banks, 1964; Savage, 1966; Rushton, 1968; Eysenck and Cookson, 1969). In this
study zonal analysis has indicated that HALM pupils are predominantly introverted 
in personality and that there is greater polarisation in attainment between HALM 
and HAHM pupils for boys under a formal teaching style. Rushton’s (1969) finding 
that extraversion scores were subject to considerable change is therefore of direct 
relevance since if the measures utilised in this study are indeed indicative of coping 
strategy it may well be that changes in strategy could occur in conjunction with: 


(a) changes in schooling, as for instance primary to secondary;
(b) maturation (suggested by Seddon, 1977);
(c) success or failure experience (suggested by Finlayson, 1972);
(d) any combination of (a), (b) or (c).

In support of (c), Gray (1971) has demonstrated that the brain mechanisms which
mediate immediate responses to signals of punishment/non-reward also mediate
the development of persistence when subjects receive intermittent reward and
punishment/non-reward. If, as he suggests, these mechanisms are more active in
neurotic introverts then they should show either excessive avoidance or excessive
persistence. In support of (b), Gray also suggests that, developmentally, persistence
should follow avoidance.


Findings by Gaudry and Fitzgerald (1971) at secondary level and Spielberger 
(1962) at the tertiary level of education have suggested that ‘anxiety’ may be positively
related to attainment for higher ability students. This suggestion is consistent with
the present findings since Finlayson (1972) has shown that level of expressed
‘achievement motivation’ may be affected by success and failure experience, and experience
of success is more likely to accrue for the academically more able student.


It would seem that there is strong support for the method of identification of 
pupil coping strategies used in this study, but more research in this area would 
seem to be indicated. The evidence presented here clearly demonstrates that for the 
purpose of identification of pupils who may be ‘at risk’, measures of trait or state
anxiety alone are insufficient; some indication of directionality such as the
‘achievement motivation’ scale used in this study is also required. The evidence also supports
the suggestion made by Lazarus (1966) that we should define stress in terms of
transactions between individuals and situations. It also supports Entwistle's (1974)
contention that theories of personality need to be adapted to fit educational
environments. The use of scales which bear close conceptual links with the classroom situation
would seem to be vindicated. 


From a methodological standpoint the findings in this study bear witness to the 
importance of: 


(a) using methods of analysis that permit the detection of effects;
(b) undertaking analyses separately by sex using the within sex distributions;
(c) adequate specification of settings.

The findings relating to extraversion are of great interest. It may be that
avoidance behaviour takes different forms and that these forms are related
to the extraversion dimension. This point will be discussed further in the second
paper in which an investigation into the relationship between the inferred coping 
strategies and pupil classroom behaviour is reported. 


HIGHLY ANXIOUS PUPILS IN FORMAL AND INFORMAL
PRIMARY CLASSROOMS; THE RELATIONSHIP BETWEEN
INFERRED COPING STRATEGIES AND: II—CLASSROOM
BEHAVIOUR


By BARBARA E. WADE 
(Department of Social Administration, The London School of Economics) 


SUMMARY. This study was carried out in conjunction with the Teaching Styles project 
(Bennett, 1976). A behaviour observation schedule was devised and utilised to investi- 
gate pupil classroom behaviour in relation to differing levels of anxiety and achievement 
motivation. The observation sample comprised 104 10- and 11-year-old pupils in 
formal and informal classrooms. Based on the premise that self-report measures of 
anxiety and achievement motivation may be indicative of coping strategies, zonal 
analyses were carried out separately by sex and within teaching style. Results, which 
varied slightly according to teaching style, indicated that an inferred coping strategy of 
‘approach’ (high anxiety, high motivation) was associated with a higher frequency of
work activity whereas an inferred coping strategy of ‘avoidance’ (high anxiety, low
motivation) was associated with a higher frequency of social interaction, negative 
behaviour, watching other pupils, gazing into space, fidgeting and moving around the 
classroom. These results were much less marked for girls than for boys. Information 
on the home backgrounds of high anxious, low motivated pupils suggested a higher 
incidence of home problems. The parents of high anxious, high motivated pupils were 
frequently described by teachers as ‘pushing’ or ‘over keen’. The educational 
implications of these findings, together with the previously reported findings relating 
the inferred coping strategies to cognitive attainment, are discussed.


INTRODUCTION 


IN the first paper (Wade, 1981) it was reported that for a sample of 956 10- and 
11-year-old pupils under differing teaching styles, an inferred coping strategy of 
approach (high anxiety, high motivation) was associated with higher attainment than 
an inferred coping strategy of avoidance (high anxiety, low motivation). Even when 
a stringent indicator of ability (a pre-test measure of cognitive attainment) was 
included in analysis this association was still apparent for all low ability boys. At 
the upper ability level significant discrepancies in attainment were shown only for 
boys in formal classrooms where an inferred coping strategy of ‘approach’ was 
found to be associated with the highest level of attainment and an inferred coping 
strategy of ‘avoidance’ was associated with the lowest level of attainment. For 
girls the discrepancies in attainment between ‘approachers’ and ‘avoiders’ largely
disappeared when ability was taken into account. These findings were interpreted
in terms of sex-related conformity for girls and in terms of extrinsic motivational
incentives, curriculum structure and possible parental influence for boys. A further
finding indicated that ‘avoiders’ showed a strong tendency to be introverted whereas
‘approachers’ were predominantly extraverted in personality. This finding was
interpreted in the light of Gray's (1970, 1971) postulate that introverts are relatively
more susceptible to punishment/non-reward, whereas extraverts are relatively more
susceptible to reward. In this second paper an investigation into the classroom
behaviour of a 10 per cent subsample of these same pupils is reported.


The underlying rationale for the inferred coping strategies was described in 
detail in the first paper. Briefly it derives from: 


(a) recognition of the anomaly suggested by a high score on a measure of 
anxiety together with denial of the need to achieve; 
(b) a suggestion made by Finlayson (1972) that high drive boys experience
conflict when expectations of success are not confirmed and that this results 
in denial of motivation on a self-report measure. 


The Atkinson (1964) formulation implies that there is an approach-avoidance 
conflict for any individual in the performance of a task, and Mandler and Sarason 
(1952) suggest that coping strategies in response to anxiety may manifest in avoidance 
behaviour with attempts to leave the task situation, or in approach behaviour which 
leads to the reduction of anxiety through task completion. Accordingly, it is 
hypothesised that ‘approachers’ (HAHM) will apply themselves more diligently in
the classroom situation than ‘avoiders’ (HALM) who might be expected to seize
every opportunity of evading involvement in school work. In practical terms this
hypothesis implies that pupil observation should reveal a higher frequency of work
involvement by ‘approachers’ than by ‘avoiders’, concomitant with a lower
frequency of non-work related activities such as: social interaction with peers,
watching other pupils, gazing into space or out of the classroom window, fidgeting,
or wandering around the classroom. A higher frequency of negative behaviour might
also be expected on the part of ‘avoiders’. Since by definition (Bennett and Jordan,
1975), ‘formal’ teachers favour an individual approach to work, less work-related
interaction should be evidenced by ‘approachers’ than by ‘avoiders’ in conformity
with a formal teaching regime. Conversely, in the ‘informal’ situation, where
teachers favour group work, ‘approachers’ should reveal a higher level of work-related
interaction than ‘avoiders’. This study investigates these aspects of pupil
classroom behaviour in relation to the inferred coping strategies under formal and 
informal teaching styles. 


DESIGN OF THE INVESTIGATION 

Subjects 

The observation sample was selected on the basis of a pupil typology created 
in connection with the Teaching Styles project (Bennett, 1976). The typology was
based on cluster analysis of the responses of a sample of 956 pupils to a pre-test
battery of affective measures. A list of 160 children, belonging to the eight clusters
defining the typology, was compiled for the observer who was unaware of pupil
cluster membership. The 160 children were pupils in the 12 most ‘formal’ and the
13 most ‘informal’ of the 37 classes selected for the Teaching Styles project.
Subsequently three classes were lost for observational purposes due to absence or
replacement of the teacher concerned. Some loss of subjects also occurred because of pupil
absenteeism. Ultimately a total of 104 pupils was observed over a period of
approximately eight weeks towards the end of the summer term, 1974. Fifty-six of these
pupils were in ‘informal’ classrooms and 48 were in ‘formal’ classrooms.


The observation schedule 

The Pupil Observation Schedule (PBS) was developed by the author for use within 
the Teaching Styles project. The aims of the research to be conducted using the 
PBS were: 


(1) To investigate pupil classroom behaviour in relation to gross differences in 
teaching style. 

(2) To study the relationship between teaching style and the behaviour of pupils 
of differing personality. 

(3) To investigate the relationship between the inferred coping strategies and 
classroom behaviour. 


The results of the first two investigations are reported in Bennett (1976). The development
of the PBS is described in detail elsewhere (Wade, 1979). The instrument
focuses on behaviours which require little inference on the part of the observer.
The following categories are included: 
(1) Work Activity—writing, mathematics, making, reading and miscellaneous
activities.
(2) Pupil-pupil interaction:
    (a) work-related—asking, responding and cooperating;
    (b) social—attracting attention, playing or chatting, negative behaviour.
(3) Pupil-teacher interaction.
(4) Negative behaviour—arguing, pushing or snatching and like behaviours.
(5) Watching the teacher.
(6) Watching other pupils.
(7) Avoidance/daydreaming, gazing out of the classroom window.
(8) Wandering around the classroom.
(9) Fidgeting.
(10) Preparing and clearing away materials.
(11) Waiting to see the teacher.


Unfortunately, due to limited time and finance, it was not possible to train 
other observers in the use of the schedule prior to its use within the Teaching Styles 
project, but a coefficient of inter-observer reliability (Scott, 1955) in excess of 0.9
was subsequently obtained by two observers after approximately 13 hours of training. 
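For readers unfamiliar with the coefficient cited, the following is a minimal sketch of Scott's (1955) index for two observers coding the same series of observation points into nominal categories. It corrects the observed proportion of agreement for the agreement expected by chance, the latter being estimated from the category proportions pooled across both observers. The function name and the example codes are hypothetical; the observers' actual records are not given in the paper.

```python
from collections import Counter


def scott_pi(codes_a, codes_b):
    """Scott's pi for two observers' nominal codes of the same observation points."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both observers must code the same, non-empty series")
    n = len(codes_a)
    # Observed agreement: proportion of points given the same code by both.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Expected agreement: sum of squared category proportions, pooling
    # both observers' codes (2n codes in all).
    pooled = Counter(codes_a) + Counter(codes_b)
    expected = sum((count / (2 * n)) ** 2 for count in pooled.values())
    if expected == 1.0:
        # Only one category was ever used, so agreement is complete.
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented example: four observation points coded by two observers.
observer_1 = ["work", "work", "fidget", "watch"]
observer_2 = ["work", "work", "fidget", "work"]
pi_example = scott_pi(observer_1, observer_2)
```

Because the chance correction depends on how unevenly the categories are used, a coefficient above 0.9 is a more demanding criterion than 90 per cent raw agreement.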


Procedure 

One or two days were spent observing in each classroom, depending on the 
number of target pupils. A quiet vantage point was chosen and a time sampling
procedure was followed in which pupils were observed in rotation for five-minute 
periods, behaviour being recorded at 15-second intervals. All categorised behaviours 
were recorded. For example, if a pupil was observed fidgeting whilst writing both 
these behaviours were recorded. This procedure enabled the collection of more 
information within a limited period and did not require the observer to decide which 
item of behaviour was the more important. At the end of each observation session 
teachers were asked a non-directive open-ended question about the home circumstances
of target pupils.
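The recording procedure can be sketched as follows. A five-minute period sampled at 15-second intervals yields 20 recording points, and because all categorised behaviours are recorded, a single point may contribute to more than one category. This is an illustration under stated assumptions only: the category labels are shorthand invented here and the example period is fabricated for demonstration, not taken from the observation records.

```python
from collections import Counter

PERIOD_SECONDS = 5 * 60   # each pupil is observed for five minutes at a time
INTERVAL_SECONDS = 15     # behaviour is recorded every 15 seconds
POINTS_PER_PERIOD = PERIOD_SECONDS // INTERVAL_SECONDS  # 20 points per period


def tally(period):
    """Count how often each category was recorded in one observation period.

    `period` is a list with one set of category labels per recording point,
    so concurrent behaviours (e.g. fidgeting whilst writing) are both counted.
    """
    counts = Counter()
    for behaviours in period:
        counts.update(behaviours)
    return counts


# Invented period: 12 points fidgeting whilst working, 8 points watching pupils.
example_period = [{"work", "fidget"}] * 12 + [{"watching pupils"}] * 8
example_counts = tally(example_period)
```

One consequence of recording concurrent behaviours is that the category frequencies for a pupil need not sum to the number of recording points.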


RESULTS 


In view of the limited size of the observation sample and the need to carry out 
analyses separately by sex and within teaching style whilst retaining adequate group 
size for comparative purposes, it was decided that median splits should be made on 
the post-test ‘anxiety’ and ‘achievement motivation’ scales. The same cut-off
points were used as for the attainment comparisons reported in the previous paper 
(Wade, 1981), but the limited size of the observation sample precluded comparisons 
of behaviour within ability level. 
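The allocation of pupils to zones by median split can be sketched as below. The cut-off values are passed in rather than recomputed, since the text states that the cut-off points from the previous paper were reused. The function name, the example scores and cut-offs, and the treatment of scores falling exactly on a cut-off (here counted as ‘low’) are all assumptions made for illustration.

```python
def zone(anxiety, motivation, anxiety_cut, motivation_cut):
    """Return the zone label (HAHM, HALM, LAHM or LALM) for one pupil.

    Scores strictly above a cut-off count as 'high'; the handling of scores
    equal to the cut-off is an assumption, not taken from the paper.
    """
    anxiety_part = "HA" if anxiety > anxiety_cut else "LA"
    motivation_part = "HM" if motivation > motivation_cut else "LM"
    return anxiety_part + motivation_part


# Invented scores and cut-offs, applied within one sex as the paper requires.
ANXIETY_CUT, MOTIVATION_CUT = 25, 15
pupils = {"pupil_1": (30, 20), "pupil_2": (30, 12), "pupil_3": (18, 20)}
zones = {
    name: zone(anx, mot, ANXIETY_CUT, MOTIVATION_CUT)
    for name, (anx, mot) in pupils.items()
}
approachers = [name for name, z in zones.items() if z == "HAHM"]
avoiders = [name for name, z in zones.items() if z == "HALM"]
```

Only the two high anxiety zones enter the behavioural comparisons that follow.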


Zonal analyses were carried out. Mean frequency counts of the behaviour of 
‘approachers’ (HAHM) and ‘avoiders’ (HALM) for the major categories are
given separately by sex and within teaching style in Tables 1 and 2. Mann-Whitney 
U tests were carried out to assess the significance of differences in behaviour between 
the two groups. 
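As an indication of how the test statistic is obtained, the following is a minimal sketch of the Mann-Whitney U computation for two small independent groups, using mid-ranks for tied values. The per-pupil frequency counts are not reported in the paper, so the example data are invented, and the function name is illustrative. With groups as small as those compared here, significance would ordinarily be judged by referring U to tabled critical values for the two group sizes.

```python
def mann_whitney_u(group_x, group_y):
    """Return the smaller of the two Mann-Whitney U statistics."""
    pooled = sorted([(v, 0) for v in group_x] + [(v, 1) for v in group_y])
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of ranks i+1 .. j.
        midrank = (i + 1 + j) / 2
        members_of_x = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        rank_sum_x += midrank * members_of_x
        i = j
    n_x, n_y = len(group_x), len(group_y)
    u_x = rank_sum_x - n_x * (n_x + 1) / 2
    u_y = n_x * n_y - u_x
    return min(u_x, u_y)


# Invented work-activity counts for four 'avoiders' and eight 'approachers'.
avoider_counts = [70, 82, 95, 101]
approacher_counts = [98, 104, 106, 108, 110, 112, 115, 120]
u_example = mann_whitney_u(avoider_counts, approacher_counts)
```

A rank-based test of this kind makes no assumption about the distribution of the frequency counts, which suits groups of four to eleven pupils.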


Boys in formal classrooms 

Differences in behaviour in line with the predictions are apparent for these
pupils (Table 1). Although the discrepancy between the two groups is not statistically
significant, more work activity was evidenced by ‘approachers’ than by ‘avoiders’.
In conformity with the formal situation significantly fewer work-related exchanges
were observed with regard to ‘approachers’ than ‘avoiders’. ‘Avoiders’ were also
much more frequently seen engaging in social conversation or watching other pupils.

TABLE 1
THE OBSERVED BEHAVIOUR OF ‘APPROACHERS’ (HAHM) AND ‘AVOIDERS’ (HALM) IN
FORMAL CLASSROOMS
Mean Frequency
                                        Boys                           Girls
Anxiety                     High    High   Mann-Whitney    High    High   Mann-Whitney
Motivation                  High    Low    U test          High    Low    U test
N                              8       4   (1-tailed)        11       4   (1-tailed)
Work Activity              108.1    95.8   NS             105.0    89.3   NS
Work-related Interaction    13.5    26.3   P<0.01          25.6    25.5   NS
Social Interaction           7.0    15.3   P<0.05          10.7    20.5   NS
Teacher Interaction         13.0     6.5   NS               9.4     8.3   NS
Negative Behaviour           0.8     2.5   NS               1.2     2.8   NS
Watching Teacher            40.0    30.0   NS              34.3    39.8   NS
Watching Pupils             16.5    32.5   P<0.01          24.7    29.5   NS
Avoidance                    3.8     5.3   NS               4.7     5.0   NS
Classroom Movement           6.5    13.5   NS               6.3    10.3   NS
Fidget                      41.0    56.0   NS              45.6    56.0   NS
Preparation                  9.3    12.0   NS               8.7    16.0   NS
Waiting                      2.5     1.0   NS               1.9     4.3   NS

TABLE 2
THE OBSERVED BEHAVIOUR OF ‘APPROACHERS’ (HAHM) AND ‘AVOIDERS’ (HALM) IN INFORMAL
CLASSROOMS
Mean Frequency
                                        Boys                           Girls
Anxiety                     High    High   Mann-Whitney    High    High   Mann-Whitney
Motivation                  High    Low    U test          High    Low    U test
N                              8       7   (1-tailed)        10       5   (1-tailed)
Work Activity               94.1    69.6   P<0.05          96.2    82.0   NS
Work-related Interaction    17.9    19.9   NS              36.3    23.8   P<0.05
Social Interaction          11.0    25.7   P<0.05          17.5    18.0   NS
Teacher Interaction          9.1    17.9   NS              10.7     5.0   NS
Negative Behaviour           2.0     5.4   P<0.05           2.7     0.6   P<0.05
Watching Teacher            25.3    25.1   NS              24.6    25.8   NS
Watching Pupils             25.4    25.1   NS              25.8    35.4   NS
Avoidance                    2.4     8.4   P<0.01           3.4     4.2   NS
Classroom Movement           9.4    14.3   NS              10.7    18.4   NS
Fidget                      29.9    52.6   P<0.01          38.4    36.6   NS
Preparation                  8.5    12.6   NS               7.7     6.4   NS
Waiting                      2.5     4.3   NS               2.8     4.8   NS

They also tended to daydream, fidget and wander around the classroom to a greater
extent than their more motivated peers. 


Girls in formal classrooms 

Scrutiny of Table 1 reveals that many of the observed differences in behaviour 
between ‘approachers’ and ‘avoiders’ are in the predicted direction although they
are not statistically significant. ‘Approachers’ were more frequently observed on
task whereas ‘avoiders’ were more frequently seen waiting to see the teacher or getting
materials. ‘Approachers’ spent less time chatting, arguing, fidgeting or wandering
around the classroom than ‘avoiders’. Unlike the boys, there was no difference in
the observed frequency of work-related interaction. 


Boys in informal classrooms 

Table 2 reveals that considerable differences in behaviour, in line with the 
predictions, are apparent for these pupils. Significantly more work activity was 
observed with regard to ‘approachers’ than ‘avoiders’, who were more often seen
chatting to peers about matters unrelated to school work, arguing or pushing,
daydreaming and fidgeting. ‘Avoiders’ tended to wander around the classroom to a
greater extent than ‘approachers’ and appeared to spend more time getting materials
or waiting to see the teacher, with whom they had more contact. Little difference
can be seen in the extent to which ‘approachers’ and ‘avoiders’ co-operated or
discussed their work. 


Girls in informal classrooms 
Behavioural differences are also largely in the predicted direction for these pupils. 
A higher level of work activity was evidenced by ‘approachers’ than by ‘avoiders’
although the discrepancy between the two groups does not reach the 0.05 level of
significance. There was little difference in the extent to which ‘approachers’ and
‘avoiders’ indulged in social exchanges, but, in conformity with the informal situation,
‘approachers’ appeared to spend more time co-operating or discussing matters
relating to work. ‘Avoiders’ tended to watch other pupils more and move around
the classroom to a greater extent than their more motivated counterparts. On the
other hand ‘approachers’ were more often seen in disagreement with their peers.
Since such behaviour was comparatively infrequent it may be that this category is 
not very reliable. 


DISCUSSION 


The finding that highly ‘motivated’ highly ‘anxious’ pupils tend in all cases to
show greater persistence in work activities than low ‘motivated’ highly ‘anxious’
pupils is complementary to the main sample finding of significant differences in 
attainment between these two groups in both types of classroom (Wade, 1981). 


The finding of significant differences in behaviour, in line with the predictions, 
between the two high anxiety groups would seem to lend further support to the 
proposition that measures of ‘anxiety’ and ‘achievement motivation’ reflect coping
strategies, but it would also seem that for girls these strategies are less relevant in the 
classroom situation than for boys. Not only were the previously reported differences 
in attainment less extensive for girls than for boys, but when ability was taken into 
account discrepancies in level of attainment were extremely small. Comparisons of 
behaviour also show only minor differences. 


A much clearer pattern emerges for boys. Not only were there discrepancies in
level of attainment for all lower ability boys and for upper ability boys under a formal
teaching style, but concomitant behavioural differences between ‘approachers’ and
‘avoiders’ are also evident. In both formal and informal classrooms ‘avoiders’
were observed to spend more time in casual conversation, indulge in more negative
behaviour, move around the classroom to a greater extent and fidget more. In formal
classrooms ‘avoiders’ tended to have rather less contact with the teacher than other
pupils, but in informal classrooms they tended to have more. In formal classrooms,
contrary to the norm, ‘avoiders’ spent significantly more time discussing work with
peers, perhaps seeking help.


Limited questioning of the teacher as to the home backgrounds of children 
would seem to shed further light on differences between ‘approachers’ and ‘avoiders’.
Of the 20 ‘avoiders’ who were observed, information was available for 16 children.
Of these 16, six came from broken homes, five others came from families who were
coping with other problems and one had recently been transferred from a special
school. Three of the boys had been caught stealing in school and one of these had
been referred to a psychiatrist. Two of the children were said to come from ‘happy’
homes. 


Of the 37 ‘approachers’ who were observed, information was forthcoming for
26 children. Of these, 12 were said to have ‘happy’ home backgrounds. Four children
were said to have some sort of problem at home. Of prime interest with regard to
this group of children is that the parents of nine of them were described by teachers
as being ‘over keen’, ‘pushing’ or ‘dogmatic’.

Atkinson (1964) suggests that high performance, when the tendency to approach 
success is equal to the tendency to avoid failure (HAHM or LALM), can be attributed
to the influence of extrinsic motivational incentives. In interpretation of the main
sample findings (reported in the first of these two papers) it was suggested that
extrinsic incentives stemming from outside the classroom, such as parental ‘spurring’,
may also mediate between ability and attainment. The information gleaned from
teachers with regard to the home backgrounds of the pupils comprising the observation
sample lends considerable support to this suggestion, for it would seem that the
parents of high achieving, anxious, motivated boys are much more ‘pushing’ with
regard to their children’s schoolwork than other parents. It might be expected in
terms of sex role and parental ambitions that this would be of more importance for
boys than for girls. Perhaps parents are more ‘pushing’ in respect of their sons
than of their daughters and this results in anxious striving on the part of their male 
offspring. 

Judging by the behaviour and performance of ‘avoiders’ it would also seem
that boys may be more susceptible to emotional disturbance when home circumstances
are unhappy. If boys are indeed less conformist in terms of social role in the
classroom then home circumstances may have a greater influence on classroom behaviour
and performance than they do for girls. 


The main sample findings also showed that the mean extraversion scores for 
HAHM pupils (approachers) were considerably higher than those of HALM pupils 
(avoiders). This was the case for both sexes under all teaching styles. These findings 
were interpreted in the light of Gray’s (1970, 1971) theory of extraversion and the 
fear-frustration hypothesis which suggest that extraverts are relatively more susceptible 
to reward whereas introverts are more susceptible to punishment/non-reward. 
However it was also apparent, from the size of the standard deviations, that whereas 
‘approachers’ were generally extraverted in personality there was a wider spread of
scores on this dimension for ‘avoiders’. It would seem possible therefore that
avoidance behaviour might take different forms. 


Eysenck and Rachman (1965) suggest that extraverted, neurotic children generally 
display antisocial behaviour whereas introverted, neurotic children tend to be sensitive, 
depressed, absent-minded, inefficient and are given to daydreaming. Lunzer (1960) 
also describes two basic manifestations of maladjustment in primary schoolchildren, 
namely aggression and withdrawal. In the present study, observation of classroom
behaviour has revealed that ‘avoiders’ indulged in more chatting about matters
which were not related to their school work, more fidgeting, more daydreaming or
watching other pupils, more negative behaviour and less work activity than
‘approachers’. It would seem that both types of behaviour were in evidence for
‘avoiders’, especially for boys in informal classrooms where, by definition, teachers
placed fewer constraints on the behaviour of pupils. 


In the absence of replication the findings reported in these two papers must 
necessarily be regarded with caution. The implications of the findings are nevertheless 
important for, given the assumption that self-report measures of ‘anxiety’ and
‘achievement motivation’ are indicative of coping strategy, these results appear to
indicate that teaching methods have an important bearing on the outcome of the
different strategies. It would appear that ‘approachers’ attain more under a highly
structured, evaluative formal teaching regime. However, such a regime may not
be so conducive to high attainment for their equally anxious peers who express denial 
of the wish to achieve and who, by virtue of their behaviour and attainment levels, 
may be denied access to reward and are thereby caught in a vicious circle. It would 
seem that these children need reassurance together with the provision of clearly 
delineated tasks, carefully geared to ability level, to minimise the possibilities for 
avoidance activities. Continued experience of success will be necessary for these 
children so that they may be encouraged to adopt a more appropriate coping strategy. 


The training of teachers in the judicious use of praise and encouragement has 
been shown to be effective in the control of deviant behaviour in studies reported by 
Becker et al. (1967) and Ward (1972). Where home problems exist, parent-teacher 
communication would seem vital if the teacher is to help to provide a degree of 
security that may be temporarily absent in the home situation. In short, what is 
advocated is not adherence to any one teaching method but the employment by 
teachers of a flexible approach, encompassing different teaching strategies that will 
take account of variations in the home backgrounds of pupils together with variations 
in ability level, anxiety and expressed motivation. Whilst this may seem an unrealistic 
aim where classes are large, the problem might be resolved to some extent by the use 
of team teaching methods which could be geared to the needs of children rather than 
to subject or teacher specialisms. If the polarities in pupil performance that are 
unrelated to ability are to be avoided then it would seem more profitable to discuss 
ways in which the specific needs of children might be met than to discuss the relative 
efficacy of differing teaching styles. 

Substantiation in great measure of the hypotheses made in relation to cognitive 
and behavioural outcomes in association with the inferred coping strategies points 
up the need for a re-evaluation of research into the relationship between personality 
and attainment. Acceptance of the proposition that measures of ‘anxiety’ and
‘achievement motivation’ are indicative of coping strategies would necessitate a
theoretical shift in our approach to the investigation of individual differences which 
merits careful consideration. 

Finally, with regard to the observation of pupil classroom behaviour, the results 
of this foray bear testimony to the value of this approach as a means of: 

(a) monitoring the effects of differing teaching strategies;
(b) adding to our understanding of the dynamics of the classroom situation;
(c) relating pupil attitudinal data to actual behaviour.


REFERENCES 


ATKINSON, J. W. (1964). An Introduction to Motivation. New York: D. Van Nostrand.
BECKER, W. C., MADSEN, C. H., ARNOLD, C. R., and THOMAS, D. R. (1967). The contingent use of
    teacher attention and praise in reducing classroom behaviour problems. J. spec. Educ., 1, 287-307.
BENNETT, S. N. (1976). Teaching Styles and Pupil Progress. London: Open Books.
BENNETT, S. N., and JORDAN, J. (1975). A typology of teaching styles in primary schools. Br. J.
    educ. Psychol., 45, 20-28.
EYSENCK, H. J., and RACHMAN, S. (1965). The Causes and Cures of Neurosis. London: Routledge.
FINLAYSON, D. S. (1972). Expressed achievement motivation in relation to the achievement motive,
    neuroticism and school success. Br. J. educ. Psychol., 42, 65-70.
GRAY, J. A. (1970). The psychophysiological basis of introversion-extraversion. Behav. Res. Ther.,
GRAY, J. A. (1971). The Psychology of Fear and Stress. London: Weidenfeld.
LUNZER, E. A. (1960). Aggressive and withdrawing children in the normal school. Br. J. educ.
    Psychol., 30, 119-123.
MANDLER, G., and SARASON, S. B. (1952). A study of anxiety and learning. J. abnorm. soc. Psychol.,
SCOTT, W. A. (1955). Reliability of content analysis: the case of nominal coding. Public Opin. Q.
WADE, B. E. (1979). Anxiety and achievement motivation in relation to the cognitive attainment and
    behaviour of pupils in formal and informal classrooms. Unpublished doctoral thesis, University
    of Lancaster.
WADE, B. E. (1981). Highly anxious pupils in formal and informal primary classrooms; the relationship
    between inferred coping strategies and: I.—Cognitive attainment. Br. J. educ. Psychol., 51.
WARD, J. (1972). Modification of deviant classroom behaviour. Br. J. educ. Psychol., 42, 304-313.


(Manuscript received 22nd April, 1980) 


Br. J. educ. Psychol., 51, 58-65, 1981 


THE PRODUCTION AND PERCEPTION BY PROFOUNDLY 
DEAF CHILDREN OF SYNTACTIC TIME CUES IN 
ENGLISH 


By G. P. IVIMEY
(University of London Institute of Education) 


Summary. Earlier analyses of the syntactic development of English profoundly deaf 
children have revealed the existence of non-standard stages in the evolution of verb 
phrases. These pass from a stage of unit verbs where form of the verb is constant, 
usually present or past, but with no regular time-reference; through an intermediate 
stage of dual time reference, frequently, as with the children studied, contrasting future
and a common present/past system. In this latter, form still has no systematic time- 
marking function. Finally children develop a three-fold time marking system in which 
form and reference of verbs are fairly consistently related. It was predicted that the 
stage of development reached in verb-phrase production would influence the perception 
of verb phrases in language reception. A reading task, using vocabulary known to the 
subjects, was used, requiring the children to recognise the time-reference of simple 
sentences. The prediction was supported: the productive linguistic model of the 
children appears to be related to their perception of language. 


INTRODUCTION 


A CENTRAL element in contemporary cognitive psychology is that perception occurs
very largely in terms of the cognitive models that the perceiver applies to the
phenomena being perceived. Neisser has argued that “whatever we know about reality
has been mediated, not only by the organs of sense but by complex systems which
interpret and reinterpret sensory information” (Neisser, 1967, p. 3). Neisser was,
here, speaking in general terms of the processes of human perception. Other workers
have been more specific. Thus Liberman, in an extensive review of the experimental
literature in the perception of speech, writes: “We find it plausible to assume that
speech is perceived by processes that are also involved in its production. The most
general and most obvious motivation for such a view is that the perceiver is also a
speaker and must be supposed, therefore, to possess all the mechanisms for putting
language through the successive coding operations that result eventually in the
acoustic signal” (Liberman et al., 1967). Stevens makes a similar point: “The
auditory processing of speech bears an especially close relation to the cognitive and
motor ability involved in the generation of speech” (Stevens and House, 1972).
Smith has reached a similar conclusion with regard to the perceptual processes
involved in reading: “I shall propose that the actual marks on a printed page are
relatively less important than the knowledge of language that a skilled reader has
before he even opens the book. And the description of the visual process will imply
that the information that passes from the brain to the eye is more important in
reading than the information that passes from the eye to the brain” (Smith, 1971,
p. 9).

A number of workers have demonstrated this point experimentally. Bever
(1973) has shown that when sentences occur against a background of noise, those
that are probable (e.g., “The mother patted the dog”) are perceived readily while
those that are improbable (“The dog patted the mother”) are not perceived. Going
further, it was shown that where noise levels are set such that a sentence like “The
mother patted the dog” is just perceptible, the individual words that constitute the
sentence, uttered in isolation, drop below the threshold of perception: they become
inaudible. Bever concludes: “Somehow the bits and pieces of words we hear can
contribute mutual information when they are placed in a sentence”, that is, perception
occurs as sentence organisation appears. This must surely apply also to the work of
Miller and his colleagues in the perception of elliptical and filtered speech against a
background of white noise: “When all the frequencies above 1,000 cps were removed
the ellipsis could no longer be detected. Elliptic speech sounded just the same as
normal speech under these conditions of distortion” (Miller and Nicely, 1955). In
addition these distortions and filterings did not result in reductions of comprehension.
Ivimey concluded, in a brief review of the literature, that “what we perceive is not
what the senses receive, but the end product of a complex of cognitive transformational
processes” (Ivimey, 1977b, p. 70). It is clear that these findings will have great
importance in the perception of speech by subjects whose cognitive processes are
unable to perform the complex transformation necessary for the disambiguation of
speech inputs. In so far as their linguistic competences (as compared with those of
normal subjects) are delayed or disturbed, it is likely that they will have increased
difficulty in perceiving language directed to them. One such group, clearly, consists of
children with severe auditory disability.


Such children have been shown repeatedly to have not only “poorly developed
grammatical abilities (but) also exhibit restricted, repetitive modes of expression and
limited vocabulary” (Moores, 1970). Brannon has shown that “the deaf seem to
use only a small number of different expanding words such as auxiliaries and
connectives” along with overuse of nouns and articles (Brannon, 1968). Other workers,
including Myklebust (1960) and Presnell (1973), have shown that the
area of greatest deviance is located in the verbal (including auxiliaries) component,
although some writers go further. Thus Fusfeld suggests that much of the language
production of the deaf consists of more or less random concatenations of words:
their writing “is a tangled web type of expression in which words occur in profusion
but do not align themselves in an orderly way” (Fusfeld, 1955). Blanton et al.
(1967) assert that the deaf are incapable of forming linguistic concepts.


These latter views are undoubtedly extreme, but there is general agreement among 
research workers that the productive language skills of profoundly deaf children 
display markedly deviant features. In so far as this productive language reflects 
underlying cognitive skills and processes it is likely that language perception by 
profoundly deaf subjects will also be influenced. 


The productive syntax of profoundly deaf children 

Although most writers in the area are in general agreement, few if any attempts
have been made to describe the systemic nature of the linguistic competence of deaf 
children. Errors are reported but no attempt is made to demonstrate whether these 
errors are random (which would be strong evidence for the position adopted by 
Fusfeld, Blanton et al.) or structured. However, a number of detailed studies of the 
productive syntax of profoundly deaf children have been carried out in London 
(Lachterman, 1975; Ivimey, 1978; Ivimey and Lachterman, in press) using the 
controlled elicitation language sampling technique developed by Ivimey (1976). The 
major feature of this technique is that the exact reference of each utterance obtained 
is known without ambiguity. This exactness covers not only general semantic features 
but also time, aspect, number and so on. These studies have shown that the children 
studied make use of very deviant syntactic systems, especially in the structure of verb 
phrases. A large majority (80 per cent) of the 10-year-old profoundly deaf children 
studied used unit verbs and external markers. A unit verb is a word with verbal 
functions whose form has no regular relationship with time reference. Such a verb 
may appear in past or present form with past, present or future reference. Actual 
time reference, where it occurs, is made by an external marker, most commonly an 
adverb of time. For each child the form adopted by a verb usually remains constant 
through all changes of time reference, although the form may differ between children: 

M.S.  The man smack his face (now)*
      Everyday the man smack his face.
      Tomorrow he smack his face.
L.T.  The two girls kicked his boy (now)
      The two girls kicked the boy (yesterday)
      Tomorrow Peter and Jane watched the television.
      Jane and Peter watched the television (tense ref. = now)
      Yesterday Jane and Peter watched the television.


Similar unit-verb usage was also found in 60 per cent of 12-13 year-old profoundly 
deaf children, and at age 15-16 years it is used by a large minority, approximately 
30 per cent (Ivimey, in preparation) of the children studied. 


Ivimey (1978) has shown that the development from a mainly or wholly unit 
verb system to one in which time-reference is indicated within the verb phrase passes 
through a number of stages:

Stage 1. Use of unit verbs in all or a majority of cases. The usually rare examples
of morphological change in verbs appear to be in free-variation and have no systematic 
time reference. A similar example, in the case of a noun, is seen in the examples 
of M.S. above. Man and men are for him in free variation: the stimulus included a 
single man hitting a boy. Ivimey has characterised such rare and unsystematic 
changes as pseudo-morphological forms. 

Intermediate stage. Development of regular and systematic future time marking. 
At this stage present and past reference is confused and unit verbs appear to be 
retained for these. This stage is characterised by a dual system: future vs. common
(past/present) marking. 

Stage 2. Appearance of a developed three-fold time-marking system in which 
present, past and future time reference is achieved, although often with non-standard 
English features. At this stage however there is no detectable distinction between 
different aspects (e.g., continuous, habitual or single actions): 

S.S.  (a) Man smack boy’s face (today)
          Tomorrow father will be smack the boy’s face.
          Father did smack the boy’s face (yesterday)
      (b) The dog bite the postman’s leg.
          Tomorrow the dog will bite the postman’s leg.
          Yesterday the dog was bited the postman’s leg.

Of the 20 children aged 12-13 years whose productive language was studied by Ivimey
(1978), three had reached Stage 2, four were in the Intermediate Stage and 12 were 
still in Stage 1, using mainly or wholly unit verbs. One child had developed a fairly 
regular past tense but retained a common unit-verb structure for future and present 
tense reference. 


The perception of time-cues by deaf children 
Applying the central tenet of cognitive psychology discussed above, succinctly
summarised by Smith (1971) as the view that in reading the brain contributes more than
the eye, it was predicted that, in a simple reading task in which temporal adverbs were
absent (i.e., one in which, in order to interpret time reference, subjects were forced to
utilise purely morphological evidence encoded in verb-phrases):

(a) those children who used a mainly or wholly unit-verb system would be
unable to differentiate time differences between sentences containing different
verb tenses;

(b) those children who had developed a regular three-fold time-marking system
in their productive language would perceive present, past and future
time-marking in written sentences;

(c) those children who had developed a consistent future-marking system in
production would perceive accurately future time reference but would
confuse present and past tenses.

* Where a sentence does not contain an overt time marker, the reference of the sentence as
volunteered by the child is given in parenthesis.


Perception of time-reference is defined operationally as the ability to allocate an 
appropriate adverb of time to a randomised series of sentences with different time- 
reference. 


Design of the experiment 

Experimental group: 19 out of the 20 children whose syntax was studied by 
Ivimey (1978) were selected for testing. One child was omitted since in his dual 
time-marking system he had developed a fairly regular past-marking system that 
contrasted with a common future/present system. For the group as a whole the 
mean hearing loss in the better ear for the speech-frequencies (250-2,000 Hz) was 
95 dB and the mean IQ (based on the Hiskey Nebraska Test of Learning Abilities)
was in the low 90s. All the children had spent their school lives at the same residential
school for severely and profoundly deaf children in the south of England. Each 
child had a reading age of 7 years. Mean chronological age was 12 years 8 months. 


Control group: The test was administered also to a control group of 30 normally 
hearing 7-8 year-old children of average ability and with reading ages of 7 years, 
attending a day school on the eastern outskirts of London. 


The experimental task: A number of short sentences using only simple vocabulary 
known to the deaf subjects and including many of the words and sentences used in 
the earlier syntactic analysis were prepared. Verb phrase forms included:


Present continuous 5 (includes 2 possible “near futures”)
Present habitual 
Future continuous 
Future simple 
Past with present reference 
Past simple 
Past continuous 
(A complete list of sentences is given in the Appendix.)


These forms were randomised and printed in booklet form with six sentences on 
each left-hand page. Opposite each sentence was a line on which an appropriate 
time word had to be written. Time words were supplied as follows: 

Present Reference    Future Reference    Past Reference
now                  soon                yesterday
today                tomorrow            last week
everyday             next week           before


These words were used frequently in the original elicitation by the deaf subjects 
either in verbal form or through manual signs. Amongst these children the terms 
now and today were translated by an identical sign (the forefinger pointing downwards 
at roughly chest height) and presumably have identical meaning. Everyday for the 
deaf children had present reference. Soon is a mark of general futurity and before 
of undifferentiated pastness. Identical words were supplied to the hearing control 
group. 

The experimental task consisted of reading each sentence, deciding upon its
time-reference and selecting an appropriate time word to be written opposite the
sentence. In each case the children's class teachers explained the task before the
testing began. They were instructed to give as many examples as they
wished but not to use the actual test sentences. They were asked not to let the children
begin until they were sure the task was fully understood.
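
The scoring implied by the operational definition can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `tally`, the data layout and the sample responses are hypothetical, and the orientation (rows for the time reference of the sentence, columns for the category of the word chosen) is an assumption about how the tables in the Results are laid out. Only the nine time words and their grouping come from the text.

```python
from collections import Counter

# The nine time words supplied to the children, grouped as in the text.
TIME_WORDS = {
    "now": "Present", "today": "Present", "everyday": "Present",
    "soon": "Future", "tomorrow": "Future", "next week": "Future",
    "yesterday": "Past", "last week": "Past", "before": "Past",
}
CATEGORIES = ("Present", "Future", "Past")


def tally(responses):
    """Build a 3 x 3 table of counts from (reference, time_word) pairs.

    Rows are the time reference of the sentence, columns the category
    of the time word written opposite it (assumed orientation).
    """
    counts = Counter()
    for reference, word in responses:
        counts[(reference, TIME_WORDS[word.strip().lower()])] += 1
    return [[counts[(row, col)] for col in CATEGORIES] for row in CATEGORIES]


# Hypothetical responses from one child, for illustration only.
example = [
    ("Future", "tomorrow"),
    ("Future", "next week"),
    ("Past", "yesterday"),
    ("Past", "now"),
    ("Present", "today"),
]
table = tally(example)
```

Summing such tables over the children in a group would give a table of raw scores of the kind reported below.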


RESULTS 


In Table 1 the time allocations appear to be scattered at random across the
sentences: an English future tense is more likely to be classified as past or present
than future, and a past one as future or present than past, etc. The observed scores
were examined using the chi-squared statistic under the null hypothesis (H0) that
allocations were distributed randomly: χ² = 4.047 with 4 df, a non-significant
result that does not permit us to reject H0. It appears that children who use wholly or
mainly unit-verbs in their language production are unable correctly to perceive the
time-reference encoded in normal English verb-phrases.

In Table 2 the allocations of time-words are beginning to appear less random:
the majority of future tenses are correctly allocated to future time and a slightly
greater proportion of past tenses appear to be correctly recognised. The χ² value
for Table 2 is 12.693 with 4 df, giving a significant value (0.02 < P < 0.05). Allocations
of present and past tense verb phrases are, however, still random. Children who
have developed a regular future marking system but do not distinguish between
present and past time reference in their productive language are able correctly to
distinguish future time marking but not present and past in language directed to them.

Although there are still some mis-allocations in Table 3 it is clear that a higher level
of correct recognition of time in the morphology of verb-phrases has been reached.
The χ² value for Table 3 is 56.064 with 4 df, enabling us to reject H0 (P < 0.001).
Children who have developed a three-fold distinction of time-marking in their
productive language are able to recognise with a high degree of accuracy past, present
and future time in normal English verb phrases: 100 per cent of English future tenses
are correctly recognised, as are 65 per cent of past and 60 per cent of present tenses.


TABLE 1

CATEGORISATIONS OF VERB-PHRASES BY UNIT-VERB USERS (raw scores)

                          Categorisation
Time-Reference     Present    Future    Past
Present               13        14        9
Future                45        44       64
Past                  13        23        9


TABLE 2

CATEGORISATIONS OF VERB-PHRASES BY CHILDREN IN INTERMEDIATE STAGE OF
DEVELOPMENT (dual system: future vs. common) (raw scores)

                          Categorisation
Time-Reference     Present    Future    Past
Present                7         4        5
Future                 3        16        1
Past                  26        13       29


TABLE 3

CATEGORISATIONS OF VERB-PHRASES BY CHILDREN WITH THREE-FOLD TIME MARKING
SYSTEM IN PRODUCTIVE GRAMMAR (raw scores)

                          Categorisation
Time-Reference     Present    Future    Past
Present                6         4        2
Future                 0        15        0
Past                  16         3       32


TABLE 4

CATEGORISATIONS OF VERB-PHRASES BY NORMALLY-HEARING CONTROL GROUP

                          Categorisation
Time-Reference     Present    Future    Past
Present              119        54       20
Future                44        88        2
Past                 187        14      257





In Table 4 are given the categorisations of the normally-hearing control group.
The χ² value for Table 4 is 163.259 with 4 df, enabling H0 to be rejected at well
beyond the P < 0.0001 level.
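
The paper does not spell out how its chi-squared values were calculated. The reported 4 df is consistent with a test of independence on a 3 x 3 table, since (3 - 1)(3 - 1) = 4, so the following Python sketch assumes that form of the test. The function name `chi_squared` is hypothetical, and because the cell counts in this copy come from a scanned original, the figures as printed need not reproduce the reported statistics exactly.

```python
def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c table.

    Expected counts are row_total * column_total / N. Every row and
    column total must be non-zero, otherwise an expected count is zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


# Table 3 as printed in this copy (assumed to be transcribed correctly).
table_3 = [[6, 4, 2], [0, 15, 0], [16, 3, 32]]
stat, df = chi_squared(table_3)
print(f"chi-squared = {stat:.3f} with {df} df")
```

The statistic would then be compared with tabulated critical values for the given degrees of freedom to obtain the significance levels quoted in the text.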


CONCLUSIONS 


The results shown above fully confirm the predictions made on the basis of
cognitive-psychological theory: there appears to be a very close relationship between 
the rule systems subserving the production of verb phrases by this group of profoundly 
deaf children and their ability to interpret the time reference of written English 
sentences. Those children who do not possess rules for generating output verb 
phrases that are marked systematically for time cannot perceive accurately time- 
marking distinctions while those who do possess these rules also perceive, with a 
high degree of accuracy, the time reference of input sentences. There appear to be 
differences between the categorisations of the most advanced deaf group and the 
hearing controls. However the differences of correct categorisations between the two
groups just fail to reach statistical significance (χ² = 5.918 with 2 df, 0.05 < P < 0.10).
In spite of this there are some interesting differences between the deaf and hearing
responses:

(i) The deaf are more absolute in their categorisations of future tenses. In
every case the deaf allocate these to definitely future time (tomorrow, 
next week). In contrast, one third of the categorisations of the hearing are 
to the present (today), presumably indicating the existence of a near future 
tense. It seems that the deaf may be operating within a more coarsely 
structured temporal system, paralleling the widely structured spatial system 
revealed in their use of prepositional phrases (Ivimey, 1978). 

(ii) There are similar differences between the deaf and hearing in the categorisation
of past with present reference tenses. For the hearing children these are distributed
exactly equally between present and past allocations. In contrast the deaf
allocate 67 per cent to past time and 24 per cent to present. This may also
represent a more coarsely structured time system in which past, present and
future are rigidly structured. Near past and near future seem to have very
little place in the system used by the deaf.

(iii) Although the differences in correct allocations of verb phrases of the hearing 
and most advanced deaf groups are not significant this must not be taken as 
evidence for equivalence of more general linguistic behaviour and inferred 
underlying sets of syntactic/morphological rules. As can be seen in the 
quoted examples the markers for time used by the deaf children, although 
related to those of the hearing, are in many respects very different. The 
morphology of verbal phrases produced by deaf children is markedly deviant 
from that of normal English speakers, but there is sufficient overlap between 
the systems for correct interpretation of time reference to take place, if only 
at a rather crude level. 


DISCUSSION 


Earlier work has demonstrated a close relationship between the production and 
perception of speech acts. However, in these investigations, the integrity and fluency 
of underlying linguistic systems has been assumed. Such an assumption is realistic 
in research concerned with linguistically normal adults and children. The deaf 
children studied here, although limited in number, with their deviant syntactic 
systems provide striking evidence of a similarly close relationship in the field of syntax. 
At least some of the “complex of cognitive transformational processes” referred
to by Ivimey (1977a, p. 70) can be detected: they include the sets of ordered and 
interrelated language rules that the children have acquired and use to generate 
utterances. 


Although the results examined in this paper apply only to the deaf subjects
studied it is probable that the findings have wider generality. Frequent reports on
the language of the ESN show delay in development, with a tendency to overuse
verbs ending in -ing. Newfield and Schlanger (1968) quote 97 per cent of their 30
educable mentally retarded subjects (mean mental age 6 years 2 months) as having
attained this item of English morphology, as compared with only 26 per cent for
other features of verbal morphology, while the subjects of Lovell and Bradbury (1966)
found the same form relatively easy. An unpublished study by the author of the
language usage of ESN children shows that verbs ending in -ing approximate to the
unit verbs of the deaf, i.e., they have no systematic time or aspect marking significance.
Menyuk and Looney (1972) suggest a similar feature in the productive language of 13
language disordered children (mean age 6 years 2 months) with no additional cognitive
or intellectual dysfunction.


If the similarities between the productive and perceptive language of deaf children 
demonstrated in this paper have a veridical basis in the hypothesised cognitive/linguistic 
competence of the subjects then there is a high degree of probability that each of 
these groups of children will find similar problems in attempting to interpret language 
inputs. 
APPENDIX

The sentences used in the test are:

Present habitual
    The dog bites the postman
    I go to school
    The boys help mother
    We are good children

Present continuous
    It is raining
    They have a train
    They are coming to my house
    Mary is going to have an ice cream

Future simple
    It will be foggy
    He will put the box on the table
    The boy will climb the tree

Future continuous
    The boy will be helping his mother
    The man will be reading a book

Past simple
    I saw a dog
    Daddy came home
    Mummy gave 20p to Mary
    I had an ice cream
    Daddy smacked the boy
    The boy jumped over the river

Past continuous
    John was talking
    We were having tea
    It was sunny
    Mary and Susan were playing

Past with present reference
    The girls have kicked the boy
    I have closed the door
    We have given a bone to the dog
    John has come home
    Mummy has shut the window
    I have had an ice cream
    I have been to London


Br. J. educ. Psychol., 51, 66-76, 1981 


SELF-CONCEPT AND ATTITUDE TO SCHOOL 


By BEVERLY M. ALBAN METCALFE 
(Human Resources Research Group, Management Centre, University of Bradford) 


Summary. The present research was concerned with three issues: (i) possible changes in
children's self-concepts as a result of their transferring from primary to secondary
school; (ii) possible relationship between self-concept and attitude to school; (iii)
possible differences in the attitudes to school of children with high self-concepts compared 
to those with low self-concepts. No significant differences were detected in the mean 
self-concept scores for the experimental group of boys and the experimental group of 
girls between first and second testings. However, when the children were divided into 
high or low self-concept scorers, both boys and girls in the top quartile for self-concept 
obtained significantly lower scores at second testing. Girls in the bottom quartile at 
first testing obtained significantly higher scores at second testing. Significant differences 
were found in the attitudes to school of children who held high positive self-concepts 
compared to those who held low self-concepts. 


INTRODUCTION 


THE importance of the self-concept to the study of developmental and educational 
psychology is not disputed. The self-concept refers to the collection of attitudes and 
beliefs we hold about ourselves (e.g., Burns, 1977), and is vitally important in the 
child's relationships with his teachers, classmates and others in his school and
non-school environment (e.g., Thomas, 1973). There is evidence to suggest that the school
and non-school environment can have a profound effect on the child's self-concept.
With reference to the school, Jersild (1952) goes so far as to say that “... it is
reasonable to assume that for many young people school is second only to the home
as an institution which determines the growing individual's concept of himself and
his attitudes of self-acceptance or self-rejection”. Thomas (1973) observed that
“Type of school, school organisation and teacher-pupil relationship all influence
self-concept”. The work of Pedersen (1966) and Zahran (1967) shows that the
teacher has a significant influence on the pupil’s level of self-concept; he or she can 
depress or elevate it, and can thus affect the pupil's level of aspiration and performance. 
Barker Lunn (1970) has explored the effects of teacher attitudes to streaming and the 
effect on pupils’ self-concepts. She found a complex relationship between
self-concept, school organisation, and teachers' attitudes to streaming. Davidson and
Lang (1960) found that children's perceptions of their teachers' feelings towards them 
were significantly positively related to the children's self-images. 


There is now quite a body of research (e.g., Coopersmith, 1959; Piers and Harris, 
1964; Purkey, 1970) reporting positive correlations between self-concept and school 
achievement, though subsequent research has not produced consistent results (e.g., 
Thomas, 1971). It has been shown that experiences of success and failure affect
self-attitudes (e.g., Diller, 1954), and Rushton (1966) concluded from his own research
with 11-year-old English children in their final year at primary school that “... the
less anxious, better adjusted child is most likely to succeed in school work at this age”.


Ferri (1971) directed attention to the situation in which the child is taught rather 
than to level of ability or achievement as being a major factor affecting the child's
self-concept. She found that during a 2-year test-retest period slower learners of both 
sexes had developed favourable self-concepts. This improvement was attributed to 
factors associated with placing the children in a class that suited their particular 
ability level rather than placement in a class containing a wide range of abilities. 
Subsequent research in the US by Lawrence and Winschel (1973) does not support 
the Ferri finding, and Nash (1973), in England, observed that often placement into a 
remedial class on entry to a comprehensive school was based on the criterion of
behaviour as reported by the teacher who previously taught the child rather than
specific learning difficulties. 

It is evident from the research of Nisbet and Entwistle (1966) that the period of 
transition from primary to secondary school is one which can have profound effects 
on the individual child, affecting both his academic performance and personality in 
different ways. After inspecting children's essays on the subject of transfer, Murdoch
(1966) observed that “as many as 57 per cent of the boys and 64 per cent of the girls
had experienced identifiable problems in adjustment. However, after a term or more
in the secondary school, about 80 per cent of the children then preferred it to their
primary school” (Nisbet and Entwistle, 1969, p. 29).


It would seem, then, that a number of factors including type of school organisa- 
tion, teacher attitudes and ability level of the child interact in a complex fashion, 
influencing the pupil's self-concept. In view of these considerations, there appears 
little doubt that the act of transferring from a small school, which orientates itself 
around the needs of the younger child, to a much larger school, which caters for the 
needs of the adolescent, will effect a considerable change in the child's view of himself 
in relation to members of his new school society. It can be expected that the experience 
of transfer of school will also affect a child's self-concept level.


The bulk of the research discussed above has been concerned with overall changes 
in self-concepts of whole groups of children irrespective of level of self-concept before 
transfer. The present research was concerned that treating the children in a group 
might mask the possible differential effects on distinct sub-groups, for example those 
children with high self-concept scores before leaving primary school and those with 
low self-concept scores. It was felt that those children who enjoyed high self-concepts 
in primary school might do so partly as a result of their high level of achievement in 
the particular régime of their primary school. Less distinguished achievement at 
secondary school may result in members of this sub-group having a disproportionately 
larger decrease in self-concept with change in school. Correspondingly, the child 
who was regarded as socially and/or academically undesirable or deficient in, say, 
an unstreamed primary school may find himself happier in a streamed class at 
secondary school. Here he may be catered for better, with teachers who have 
understanding and sympathy for his/her special needs, and thus come to be viewed 
in a different light. His status, relative to his new group, might enable him for the 
first time to take pride in his work and in himself. This child may gain in self-esteem 
and related factors by the transfer. 


Bearing in mind, then, that for some children there might be an increase in 
self-concept with transfer to secondary school and that for other children there might 
be a decrease in self-concept with transfer, an attempt was made to identify such 
groups. It was hypothesised that those children with unusually high self-concept 
scores before leaving primary school might contain a significant number of children 
whose self-concepts are affected favourably by their high level of achievement in a 
relatively small school but who might suffer with transfer to secondary school for the 
reasons suggested, and that the group of children with particularly low self-concept
scores before transfer might contain a significant number of children who might well
benefit from placement within groups that better match their particular needs.


Finally, measures of attitude do give some indication of how an individual 
responds to particular aspects of his environment. In doing so, they afford a reflection 
of the environment as perceived by the individual. In the present investigation the 
Attitude to School Questionnaire S-7 (Barker Lunn, 1969) was used to give some 
indication of the ways in which children perceive their primary and secondary school 
environments. It was hoped that it would be possible to attribute differences in 
self-concept, at least in part, to differences in attitudes to school. 




Certain factors have been found to be related to attitude to school, in particular 
the relationship between ability and attitude to school. It was found, not surprisingly 
резаро that the higher the ability, the more positive were the attitudes to school. 

one bears in mind the findings of Davidson and Lang (1960) and those of Cooper- 
smith, Piers and Harris, and Purkey (mentioned above) one might reasonably assume 
that tbe child's perception of the teachers' attitudes to him or herself, the relationship 
between this and the child's achievement, and finally achievement and attitude to 
school (e.g., Barker Lunn, 1970, who found that for 11-year-olds, self-concepts were 
significantly related to school achievement), together indicate an extremely complex 
relationship between self-concept and attitude to school. It would appear useful, 
therefore, to compare any significant differences in the attitudes to school of children 
who obtained low scores on the self-concept scale with those who obtained high 
scores. As it has been established that differences exist between attitudes to school 
of boys and girls (e.g., Barker Lunn, 1969) they were treated separately. 


METHOD 
The sample 
A total of 88 boys and 94 girls from schools in a North of England conurbation 
took part in the investigation. The sample was divided into two groups: 


(1) Seventy boys and 65 girls aged 11+ years, who were in their final term at one 
of four junior schools (age range 5-11 years). These schools were all in the catchment 
area of a single 11 to 16-year-old secondary school. These subjects comprise the 
Experimental Group. 

(2) Eighteen boys and 29 girls aged 11+ years, who were in the third term of 
their second year at a middle school (age range 9-13 years). The catchment area of 
this school was judged to be socio-economically similar to that of the four junior 
schools, on the basis of parental occupation. These subjects comprise the Control 
Group. 


The instruments 
Two instruments were used in the investigation: 


(1) the Piers-Harris Children's Self-concept scale (Piers and Harris, 1964), and 
(2) the Attitude to School Questionnaire S-7 (NFER, undated). 


The Piers-Harris Children's Self-concept Scale is a self-report inventory consisting 
of 80 first-person statements to which forced-choice responses are required. The 
scale is designed for subjects in the 9-13 age range. It gives a total self-concept score 
(TSC) and six contributory subscores. For the present investigation, certain questions 
were modified so as to eliminate Americanisms. Further, in view of criticisms levelled 
at forced-choice inventories by Heim (1970, p. 89), responses were made on a 5-point 
Likert-type scale. 


Reliability: Wing (1966) reported a test-retest correlation of 0.77 for both 
2-month and 4-month intervals, using the original Piers-Harris instrument. The 
author presents the following test-retest correlation coefficient for the modified 
Piers-Harris instrument over 14 days: TSC = 0.69. The TSC coefficient reported 
here compares not unfavourably with that of Wing, and with a corresponding 
coefficient of 0.68 reported by Engel (1959) over a 10-day period. 


The Attitude to School Questionnaire S-7 is a self-report questionnaire, consisting 
of 64 statements, and is designed for children in the age range 9-11 years. It gives 
ten scales, which Barker Lunn (undated) designed as follows: 


A—Attitude to school 
B—Interest in school work 
C—Importance of doing well at school 
D—Attitude to class 
E—‘Other’ image of class 
F—Conforming versus non-conforming pupil 
G—Relationship with teacher 
H—Anxiety in the classroom situation 
I—Social adjustment—getting on well with classmates 
J—Academic self-image 

Reliability: Test-retest reliability coefficients do not appear to be available, but 
Barker Lunn (undated) reported coefficients of reproducibility (where appropriate) 
of between 0.90 and 0.95. 


Hypotheses 
In view of the literature, the following hypotheses were proposed: 


(1) that there would be significant differences in the mean self-concept scores of 
children tested in their final year of primary school and again after one year 
of secondary school; 

(2) that there would be a significant increase in the mean self-concept score of 
children with low self-concepts in their final year of primary school and the 
same children retested a year later at secondary school; 

(3) that the difference in mean self-concept score at first and second testing, for 
children with high self-concept scores at primary school, would be signifi- 
cantly different from any corresponding difference in the remainder of the 
sample; 

(4) that there would be a significant difference in the attitudes to school of 
children with high self-concepts and those with low self-concepts. 


RESULTS 


The mean scores of the Piers-Harris Children's Self-Concept scale are shown for 
first and second testings, boys and girls, together with standard deviations, in Table 1. 


Table 2 shows the means and standard deviations of TSC scores for boys and 
girls at first and second testing, whose TSC scores at first testing were in the top 
quartile or the bottom quartile. 


Among the girls, the directions of the changes in TSC between first and second 
testing were consistent with Hypotheses 2 and 3 in that among the high TSC group 
the mean self-concept level decreased, whereas among the low TSC group it increased. 
Two-by-three analysis of variance indicated, however, that the changes in self-concept 
did not reach a level of statistical significance. Two significant F values were found, 
which indicated significant differences in TSC scores between the high 
versus low groups (F(2, 86) = 73.61, P < 0.01), and interaction effects (F(2, 86) = 9.32, 
P < 0.01). Among the boys, there was a decrease in the mean TSC scores among the 
high self-concept group, as predicted, but the corresponding score for the low TSC 








TABLE 1 
MEANS AND STANDARD DEVIATIONS OF TSC SCORES AT FIRST AND SECOND 
TESTING 

                       First Testing            Second Testing 
Sample                 N     Mean    SD         N     Mean    SD 

Experimental boys      55    42.56   35.15      53    42.21   35.25 
Experimental girls     46    44.65   33.95      43    42.51   32.37 

TABLE 2 

MEANS AND STANDARD DEVIATIONS OF TSC SCORES FOR BOYS AND 
GIRLS AT FIRST AND SECOND TESTING, WHOSE TSC SCORES AT FIRST 
TESTING WERE IN THE TOP OR BOTTOM QUARTILE 

BOYS               Top Quartile   Middle 50%   Bottom Quartile 
First     Mean     81.00          47.48        -5.36 
testing:  SD       10.69          12.00        24.80 
          n        14             27           14 
Second    Mean     56.57          48.85        -5.37 
testing:  SD       26.86          26.53        24.80 
          n        14             27           14 

GIRLS              Top Quartile   Middle 50%   Bottom Quartile 
First     Mean     87.45          44.88        1.36 
testing:  SD       18.15          14.95        16.36 
          n        11             24           11 
Second    Mean     65.36          53.33        23.83 
testing:  SD       25.31          20.42        18.24 
          n        11             24           11 


group remained the same. Analysis of variance revealed only one significant effect 
(F(2, 104) = 98.09, P < 0.01), which indicated heterogeneity between the high, medium 
and low TSC groups. Hypotheses 2 and 3 were not, therefore, supported. 


However, the sub-groups of high and low subjects at first testing were also 
treated as independent samples, and t-tests used to test the significance of differences 
between the scores for these subjects at first versus second testing (i.e., after change of 
school) (cf. Table 3). The mean TSC scores for both the high TSC boys and the high 
TSC girls at first testing decreased significantly from first to second testing (t = -2.64, 
df = 26, P < 0.02 for the boys; t = -2.64, df = 20, P < 0.02 for the girls) (cf. Tables 2 and 3) 
and in addition, among the low TSC girls, the corresponding mean scores showed a 
significant increase from first to second testing (t = 3.04, df = 22, P < 0.01). Each of 
these differences was consistent with the predictions made. 
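
As a rough check on the independent-samples comparison described above, the t statistic can be recomputed from summary statistics alone. The sketch below assumes a pooled-variance (Student) t-test, which the paper does not specify, and reads the Table 2 entries for the low TSC girls as means of 1.36 and 23.83 with standard deviations of 16.36 and 18.24 and n = 11 at each testing. The function name `pooled_t` is illustrative only.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t for two independent samples, from summary statistics.

    Returns (t, df) for the difference mean2 - mean1, assuming equal
    population variances (an assumption; the paper does not state the variant).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean2 - mean1) / se, df

# Low TSC girls, first vs. second testing (values as read from Table 2).
t, df = pooled_t(1.36, 16.36, 11, 23.83, 18.24, 11)
print(f"t = {t:.2f}, df = {df}")
```

Under these assumptions the statistic comes out close to the reported t = 3.04. The degrees of freedom implied by the printed group sizes (20) differ from the 22 given in the text, and the sketch does not attempt to resolve that difference.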


At first testing there were five significant differences in attitude to school among 
high versus low TSC boys, which were in A (Attitude to school), B (Interest in school 
work), C (Importance of doing well at school), D (Attitude to class), and E (‘Other’ 


TABLE 3 

DIFFERENCES IN MEAN TSC SCORES BETWEEN FIRST AND 
SECOND TESTINGS FOR BOYS IN THE TOP QUARTILE, BOYS IN 
THE BOTTOM QUARTILE, GIRLS IN THE TOP QUARTILE, AND 
GIRLS IN THE BOTTOM QUARTILE AT FIRST TESTING 

BOYS 
(a) Top Quartile            (b) Bottom Quartile 
df    t       P             df    t      P 
26    -2.64   < 0.02        26    1.69   NS 

GIRLS 
(c) Top Quartile            (d) Bottom Quartile 
df    t       P             df    t      P 
20    -2.64   < 0.02        22    3.04   < 0.01 
image of class), P beyond the 0.1 per cent level in each case (cf. Table 4). Among the 
girls, significant differences were detected in G (Relationship with teacher), H 
(Anxiety in the classroom situation, in which a high score indicates concern not 
anxiety; a low score indicates anxiety or lack of confidence), and J (Academic 
self-image), each beyond the 5 per cent level of probability. At second testing, a 
different pattern of significant differences emerged (cf. Table 5). Among the boys, 
there were differences in E (‘Other’ image of class), G (Relationship with teacher), 
H (Anxiety in the classroom situation), I (Social adjustment—getting on well with 
classmates), and J (Academic self-image), each significant beyond the 5 per cent level, 
or better. Among the girls, there were significant differences in A (Attitude to school), 
B (Interest in school work), E (‘Other’ image of class), and I (Social adjustment), 
each beyond the 5 per cent level, or better. 

Table 4 shows the difference in attitude to school scores for boys and girls whose 
TSC scores were in the top quartile versus those whose TSC scores were in the bottom 
quartile at first testing. 

Table 5 shows the differences in attitude to school scores for boys and girls 
whose TSC scores were in the top quartile versus those whose TSC scores were in the 
bottom quartile at second testing. 


TABLE 4 

DIFFERENCES IN ATTITUDE TO SCHOOL OF BOYS AND GIRLS IN 
QUARTILES AT FIRST TESTING 

Sample                           Score   df   t 

Experimental subjects (Girls) 
High TSC vs. Low TSC             G       20   2.53** 
                                 H       20   2.56** 
                                 J       20   4.93**** 

Experimental subjects (Boys) 
High TSC vs. Low TSC             A       26   3.77**** 
                                 B       26   3.09*** 
                                 C       26   3.23*** 
                                 D       26   3[illegible] 
                                 E       26   4.15**** 

* P < 0.05; ** P < 0.02; *** P < 0.01; **** P < 0.001 


TABLE 5 

DIFFERENCES IN ATTITUDE TO SCHOOL OF BOYS AND GIRLS 
IN DIFFERENT QUARTILES AT SECOND TESTING 

Sample                           Score   df   t 

Experimental subjects (Girls) 
High TSC vs. Low TSC             A       19   2.55** 
                                 B       19   2.79** 
                                 E       19   2.84** 
                                 I       19   2.21* 

Experimental subjects (Boys) 
High TSC vs. Low TSC             E       22   2.77** 
                                 G       22   2.29* 
                                 H       22   [illegible] 
                                 I       22   2.51** 
                                 J       22   2.92 

* P < 0.05; ** P < 0.02; *** P < 0.01; **** P < 0.001 

DISCUSSION 

Hypothesis 1 

No significant differences were detected between the means for the TSC scores 
for all boys and all girls for first and second testings. The absence of significant 
differences may be attributable to the fact that the second testing took place towards 
the end of the third term at secondary school. By this time, the children may have 
adjusted to their new status and have re-established a previous self-concept level. 
This would be consistent with Murdoch's (1966) finding on level of adjustment. 


Alternatively, the results might be interpreted as suggesting either that the most 
important reference group is the child's peers, and that, perhaps owing to relatively 
little communication between children from different year groups, older children do 
not constitute ‘significant others’ as far as most aspects of self-concept are concerned; 
or that interactions between children and teachers at secondary school (at least in the 
first year) are not greatly different from those at primary school, or both. Thus it 
may be that, even after changing school, children tend to maintain their self-concept 
levels. It is not possible to establish whether the children maintained or re-established 
their self-concept levels since periodic testing was not carried out. 


Hypotheses 2 and 3 

These hypotheses were tested using two-by-three analysis of variance technique. 
This was obligatory in the case of the latter hypothesis because of its differential 
nature. The changes in mean self-concept levels among both the high and the low 
self-concept girls were consistent with the predictions made. This was also true 
among the boys with high self-concepts, but the mean at first and second testings for 
the boys with low self-concepts at first testing were almost identical. Only among 
the girls did the A × B interaction reach significance level. The relative mean scores 
for the boys, though not reaching the level of statistical significance, may be regarded 
as lending supportive evidence for the rationale underlying hypothesis 3. Hypotheses 
2 and 3 were both accepted among the girls, but not among the boys. 


Hypothesis 4 
(a) Girls when tested at primary school 

Three significant differences were detected in the attitudes to school of girls with 
high self-concepts and those with low self-concepts. The first concerned academic 
self-image which is perhaps not surprising since TSC includes a factor described as 
general and academic self-image, though product moment correlations indicate that 
the attitude score J and the subscale score for academic self-image are by no means 
synonymous. The relationship between academic self-image and self-concept would 
appear obvious in the light of the findings already cited (e.g., Coopersmith, 1959; 
Purkey, 1970). Jersild states quite clearly that “... the cards are stacked against 
many children. They are stacked when teachers, in league with the prevailing 
competitive pressures in our society, attach greater importance to certain school 
achievements than they merit, and apply pressures which make the child feel that he is 
worthless in all respects because he does not happen to be a top performer in some 
respects” (1952, pp. 90-91). 


The second largest difference was in G (Relationship with teacher), which measures 
“the teacher's perceived degree of concern for the child rather than the child's liking 
for the teachers” (Barker Lunn, undated). Barker Lunn (1970) among others has 
stressed the importance of the teacher and her relationship with the child. These 
findings are consistent with the results of Davidson and Lang (1960) who found that 
pupils were very aware of their teacher's feelings towards them, and that those who 
perceived the teacher as liking them were holders of more positive self-concepts. 

Scale H (Anxiety in class), in common with scale J, is related to a contributory 
factor to TSC, namely the Anxiety subscore. Significant differences in this attitude might 
well then be expected. Rushton (1966), reviewing the literature relating personality 
characteristics of children with scholastic success, states that in 70 per cent of the 
research, stability or adjustment is positively associated with academic achievement. 
He concluded from his own research with 11-year-old English children in their final 
year at primary school that “the less anxious better adjusted child is most likely to 
succeed in school work at this age”. 


Multiple correlation coefficients between TSC and Attitude to school were 
calculated for both groups and revealed quite different patterns of relationships. 
The results showed that among the high self-concept group B (Interest in school 
work) and J (Academic self-image) together contributed 55 per cent to the variance. 
This suggests that success at school work is important to total self-concept, though 
neither was itself significantly correlated with TSC. 


Among the low self-concept children, a very different picture emerged. Here F 
(Conforming versus non-conforming) was significantly negatively correlated with 
TSC, and accounted for 41 per cent of the variance. Thus for this group the extent 
to which the child does not conform would appear to be positively related to her 
self-concept. This finding is interesting in the light of Hargreaves' (1967) notion of 
the ‘delinquescent subculture’. Hargreaves suggested that, in secondary schools at 
least, pupils who do not succeed in school according to the establishment’s criteria 
adopt their own norms, a ‘delinquescent subculture’. This may provide a source 
of attitudes and values with which it is possible to identify, and which may provide 
alternative criteria for measuring self-worth. 


Hypothesis 4, therefore, was supported amongst the girls at first testing in 
relation to these three scales. It is interesting to note that these three scales are 
included in a cluster that Barker Lunn has described as being concerned with “ social 
relations and the personality of the pupil” (undated). Differences in attitude to 
school among high versus low self-concept girls at primary school may be understood 
in terms of the acceptance of different social norms. 


(b) Girls when tested at secondary school 

A different pattern of significant differences was detected. Significant differences 
were found in A (Attitude to school), B (Interest in school work), E (‘Other’ image of 
class), and I (Social adjustment). In each case the girls with high self-concepts 
obtained higher scores on the attitude scales. 


With change of school there appears to be a change of attention among these 
extreme groups from attitudes concerned with social relations and personality to 
those directed specifically towards aspects of school (Scales A, B, and E). Certain 
factors may be operating for the first time following change of school. These may 
include entry into a much wider social environment, with renewed interest in school 
activities, the different nature of secondary school work, and a preadolescent growing 
awareness of the child's relations with others. This latter factor might account for 
the observed difference in I which is designed to measure the child's ability to ‘get on’ 
with other pupils in his class. Differences in this scale for high and low self-concept 
girls are consistent with Silver's (1958) findings that level of self-concept rating was 
directly related to a measure of interpersonal adequacy. 


Again multiple correlations between TSC and attitude to school for each group 
showed quite different patterns of relations. Getting on well with classmates appears 
to contribute positively to self-concepts among girls with high self-concepts, but 
negatively to those with low self-concepts. These differences might again be explicable, 
at least partially, in terms of Hargreaves’ ‘delinquescent subculture’. 

The groups differed also in the relationship between TSC and Scale A (Attitude 
to school). Once again there was a positive, but non-significant relationship among 
the high self-concept girls, and a significantly negative relationship among the low 
group. Again the notion of the ‘delinquescent subculture’ may be involved. 


The present findings, associating low self-concept at secondary school with 
negative attitudes to school and low social adjustment, can be related to Thompson's 
(1974) observations that in secondary school, “First-year maladjusted pupils have 
significantly less positive self-concepts than their well-adjusted peers”. 


The other two significant differences in attitudes were in B (Interest in school) 
and E (‘ Other’ image of class). The relationship between high self-concept and 
interest in school is, perhaps, not surprising. Positive inter-relationships between 
achievement, ability and self-concept are well-established in the literature (e.g., 
Coopersmith, 1959; Piers and Harris, 1964; Purkey, 1970), but Barker Lunn was 
able to conclude from her study that ability alone is not enough for achieving well; 
motivation is also crucial. 


It might thus not be unexpected that girls with high self-concepts attach more 
importance to doing well in school than girls with low self-concepts. Again the 
differences between the two groups on scales B and E may be interpreted in terms of 
accepting or rejecting the values, goals, and rewards of the school. 


The ‘Other’ image of class scale measures what the child believes others think of his 
class. The children in the present study were unstreamed both at primary and 
secondary school, and the high and low self-concept children were distributed evenly 
in the secondary school classes. The significant difference in mean score for these 
groups was due to the unusually low scores among the low self-concept children. 
It would seem, then, that the low self-concept girls were projecting their own image 
of themselves on their class, rather than reflecting the image of their class on themselves. 


Hypothesis 4 was supported. However, a different pattern of relationships 
between TSC and attitude to school emerged when the girls were at secondary school 
compared to when they were at primary school. Emphasis moved from personality 
and social relationships to attitudes more concerned with aspects of school. Once 
again though, differences between the attitudes of high versus low self-concept girls 
were relatable to differences in accepted social norms. 


(c) Boys when tested at primary school 

Five significant differences emerged when comparing the boys with high TSC to 
those with low TSC. These were A (Attitude to school), D (Attitude to class), E 
(‘Other’ image of class), B (Interest in school work), and C (Importance of doing 
well). A distinct pattern emerges in these results as all of these attitudes are concerned 
with aspects of school or school work. The relationships between high TSC and 
interest in aspects of school would appear to support Barker Lunn's (1970) conclusions 
regarding success at primary school. 

A multiple correlation between TSC and attitude to school scores shows among 
other things that ‘Other’ image of class again appears to reflect the low self-image of 
the low TSC group, as it did in the case of the girls at secondary school. The importance 
of the social adjustment score for these boys would seem to reinforce this notion. 


Hypothesis 4 was accepted, then, among the boys at primary school. It would 
appear that boys with high self-concepts held more positive attitudes to school and 
school work than their low self-concept counterparts. Self-concept among the boys 
with low self-concepts was related to two facets of social life, ‘Other’ image of class 
and social adjustment. Whether these subjects held low self-concepts partly as a 
function of relatively little interest in school and school work, or of a poor social 
image they had of themselves, is not clear. Most likely these two relationships are 
themselves not entirely unrelated. 


(d) Boys when tested at secondary school 

Again five significant differences emerged. These, in descending order of size, 
were H (Anxiety in class), J (Academic self-image), E (‘ Other’ image of class), 
I (Social Adjustment), and G (Relationship with teacher). In each case the high 
self-concept boys scored higher. 


It is consistent with the literature to expect that, as found, high self-concept boys 
would feel less anxious (Rushton, 1966), hold more positive academic self-images 
(e.g., Purkey, 1970), see themselves as getting on better with classmates (Silver, 1958), 
and feel that the teacher was more concerned about them than boys with low self- 
concepts (e.g., Davidson and Lang, 1960). Also it would appear to be consistent 
with both the literature and the results for the other sub-samples, that children with 
low self-concepts perceive others as having low regard for them (low ‘Other’ image 
of class) (e.g., Hargreaves, 1967; Nash, 1973), even though they were in unstreamed 
classes. 


A certain pattern can be discerned in the scales in which significant differences 
were detected in that, among these boys, they are related to the cluster which Barker 
Lunn described as “more concerned with social relationships and the personality 
characteristics of the pupil”. Thus, the cluster of differences which differentiates 
high from low self-concept boys at secondary school differs from the corresponding 
cluster for boys at primary school. In the latter case it was noted that the cluster 
related to aspects of school and school work. 


Hypothesis 4 was accepted among the boys when tested at secondary school in 
relation to five of the ten scales. The five scales were all concerned with social 
relationships and personality. 


In conclusion, therefore, in no case was the hypothesis rejected for all ten of the 
attitudes to school scales. There was evidence for a certain degree of specificity as to 
which particular scales distinguished high from low self-concept children, apparently 
depending on (a) whether the sub-group comprised boys or girls, and (b) whether the 
testing occurred when the children were at primary or secondary school. 


CONCLUSIONS 


While, because of the very small sizes of the sub-samples, it would be inappro- 
priate to generalise the findings to wider populations, it is interesting to note that a 
curious pattern of differences did emerge. 

(1) Scales G (Relationship with teacher), H (Anxiety in class), J (Academic 
self-image) distinguished high from low self-concept scorers among girls when tested 
at primary school, but boys when tested at secondary school. 


(2) Similarly, scales A (Attitude to school), B (Interest in school work), and E 
(‘Other’ image of class) distinguished high from low self-concept scorers among 
boys when tested at primary school, but girls when tested at secondary school. 


(3) There was no correspondence between the attitude to school scales which 
distinguished high from low self-concept scorers both at primary and secondary 
school among either the boys or the girls. As noted above, scales G, H and J are in 
the cluster concerned with social relations and personality, whereas scales A, B and E 
are in the cluster concerned with aspects of school. 


Why transfer of school should accompany changes in the types of attitudes to 
school that are important to high as against low self-concept children, and why the 
observed changes should be in opposite directions for the boys and girls, are questions 
to which there are no clear answers. To the extent to which the present observations 
can be regarded as reliable, they point to a need to investigate the differential impact 
of school on boys and girls. 

CAPACITY AND STRATEGIES OF EDUCATIONALLY 
SUBNORMAL BOYS ON SERIAL AND DISCRETE TASKS 
INVOLVING MOVEMENT SPEED 


By D. A. SUGDEN AND SUSAN M. GRAY 
(Department of Physical Education, University of Leeds) 


SUMMARY. Serial and discrete tasks were employed to investigate movement speed in 
ESN boys. Different movement difficulties were presented and Fitts’ Law was con- 
firmed, with a linear relationship between movement time and information load. 
Reaction time was unaffected by an increase in the difficulty of the required movement. 
Measures of capacity placed the ESN boys at least 5 years behind chronologically age 
matched normal boys. Performance was lower than that of normal boys yet strategies 
used on the serial task followed similar patterns. 


INTRODUCTION 


VARIOUS researchers have described the motor performance and learning of mentally 
handicapped persons (Malpass, 1959; Rarick et al., 1970). Using factor analytic 
techniques, Rarick and Dobbins (1972) and Rarick and McQuillan (1977) found that 
the factor structure of motor abilities of both moderately and severely mentally 
handicapped children was very similar to that of intellectually normal children. In 
terms of absolute measures on such variables as body size, flexibility, skinfold thick- 
ness, strength, endurance and motor co-ordination, the mentally handicapped as a 
group recorded poorer scores than normal children. This inferior performance seems 
to be highly related to their limited intellectual development, for the greater the 
mental handicap, the poorer the motor skills. The traditional research approach in 
this area has been one of a “product” orientation, that is, an investigation into the 
final movement outcome with the mentally handicapped consistently exhibiting lower 
and more variable performances than normal children. More recently, using verbal 
studies as models (Belmont and Butterfield, 1969; Brown, 1975), a “process” 
approach has gained in popularity, with motor skills being broken into their com- 
ponent parts. This approach has led researchers to investigate the control of move- 
ment by the mentally handicapped (Sugden, 1978; Kelso et al., 1979). The work 
has involved slow paced movements and, while many motor tasks are of this nature, 
others are performed in time constrained situations. 


Speed and accuracy are components of movement that have been put forward 
to explain individual differences in motor performance, and can be quantified in 
relation to the amount of information to be processed. Fitts’ Law (1954) states 
that movement time is a function of the index of difficulty (ID) of a task, where 


$ID = \log_2(2A/W)$, and where A is the movement amplitude and W the target width. 


Thus, by varying A and W, it is possible to increase or decrease the difficulty of a task 
and observe the effects of this on movement time. Fitts’ Law has been found to have 
great generality and has provided us with clues to the motor control process. For 
example, the average time per movement for a serial task has been shown to be 100 
to 200 milliseconds longer than for a comparable discrete task. This is less of a 
difference than would be expected if every response on the serial task involved a 
separate reaction time. This implied that, to some extent, the processing of feedback 
data overlapped the processes involved in the control of movement, or that the move- 
ments were programmed in advance and executed independently of feedback. 

Fitts’ Law has been confirmed in studies involving children of various ages on 
both discrete and serial tasks (Kerr, 1975; Salmoni and Pascoe, 1978; Wallace et al., 
1978; Sugden, 1980). Sugden (1980) found a developmental increase in the capacity 
of the motor system of boys and girls of 6, 8, 10 and 12 years of age. There was little 
overlap between the ages, and in terms of absolute capacity the 12-year-olds at their 
maximum level just overlapped the minimum level reported in studies involving adult 
subjects on both discrete and serial tasks. The difference in average movement time 
between the serial and discrete tasks in most cases was less than the reaction time 
preceding the discrete movement, indicating that some ongoing processing was taking 
place in addition to the control of movement. However, this did not happen with 
the most difficult movements, indicating that at this level only the control of 
movements was being processed. 


Wade et al. (1978) conducted two experiments on reaction and movement time 
with both normal and severely mentally handicapped adults. Using a response 
which could be broken down into a simple reaction time followed by a movement, 
the first experiment found the mentally handicapped to have slower and more variable 
reaction times than normals, with neither group showing improvement with practice. 
The difficulty of the movement had a significant effect on movement time, but not 
on the preceding reaction time. The movement times of the mentally handicapped 
were greater and more variable than those of the normal subjects, and the slope of 
the regression line was steeper, indicating a lower rate of information gained. The 
second experiment used a choice response paradigm and found that reaction time did 
increase with an increase in movement difficulty, so mirroring data previously obtained 
with normal subjects (Klapp et al., 1974). 


Wade et al. (1978) confirmed that the severely mentally handicapped, while 
performing much lower than normal individuals on a motor task involving speed, 
did indeed show similar patterns of processing. The present study aims to extend 
this line of investigation using moderately mentally handicapped children and employing 
both a serial and a discrete task, thus enabling comparisons concerning capacity of the 
motor system to be made between the two tasks. In addition, the difference in time 
between the two tasks will allow certain speculations to be made regarding strategies 
used on the serial task. This can then be compared with data previously obtained
on normal children. 


METHOD 
Sample 


Twelve boys from a local ESN(M) school were selected to participate in the study. 
The headmaster was asked for ESN boys whose ability was not confounded by either 
a language or a behaviour problem. Eighteen boys within the required age range came 
into this category and 12 were randomly selected. The ages of the boys ranged from 
11 years 6 months to 13 years 10 months with a mean age of 13 years 2 months. IQs 
were either unavailable or, as most were taken at least three years before, of little value. 
However, current reading ages were available and these ranged from 5 years to 8 years 
7 months with a mean of 6 years 4 months. 


Apparatus and Design 

Serial task. This involved tapping with a pencil between two circles of varying 
sizes and distances apart drawn on paper. Each boy was instructed to place a dot 
alternately in each circle as many times as possible in the five-second period. After 
an explanation and a demonstration, a practice trial was given at a movement 
difficulty not used in the experiment. There were six movement difficulties as measured
by Fitts’ formula for index of difficulty, and at every level there were six trials of
5 seconds duration. The six movement difficulties were as follows: (1) A = 16 cm,
W = 4 cm, ID = 3 bits; (2) A = 24 cm, W = 4 cm, ID = 3.585 bits; (3) A = 16 cm,
W = 2 cm, ID = 4 bits; (4) A = 24 cm, W = 2 cm, ID = 4.585 bits; (5) A = 16 cm,
W = 1 cm, ID = 5 bits; and (6) A = 24 cm, W = 1 cm, ID = 5.585 bits.
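
As a check on the six index-of-difficulty values listed above, the following sketch applies Fitts' formula in its usual form, ID = log2(2A/W), where A is the distance between the targets and W is the target width. The function and variable names are illustrative and are not taken from the paper.

```python
import math


def index_of_difficulty(amplitude_cm: float, width_cm: float) -> float:
    """Fitts' index of difficulty in bits: log2(2A / W)."""
    return math.log2(2 * amplitude_cm / width_cm)


# (amplitude A, width W) pairs in cm for the six conditions given in the text.
CONDITIONS = [(16, 4), (24, 4), (16, 2), (24, 2), (16, 1), (24, 1)]

# Rounded to three decimals, as the values are reported in the text.
ids = [round(index_of_difficulty(a, w), 3) for a, w in CONDITIONS]
print(ids)
```

Doubling the distance at a fixed width, or halving the width at a fixed distance, changes the ID by a fixed amount, which is why the six conditions fall into three pairs about 0.585 bits apart.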

Discrete task. The apparatus made by Parkway Electronics involved a reaction/ 
movement time facility with dual four digit timers and a range from 0.001 to 9.999
seconds. There was one central manual lamp and a corresponding touch plate with 
removable 1, 2 and 4 cm tops and a 1000 Hz audio feedback annunciator. The task 
consisted of moving a metal stylus from one of two base points to touch plates of 
varying diameter. The two base points were 16 cm and 24 cm away from the target 
plates giving movement difficulties which corresponded exactly with those in the serial 
task. Each boy was asked to move from a base point and hit the target plate as fast 
as possible. An explanation of the task was given followed by three practice trials at 
a movement difficulty not used in the experiment. Following this practice, ten trials 
were given at each movement difficulty. 


On both tasks the boys were tested in groups of three, with two drawing while 
the third was involved in the experiment. Each boy was tested on three movement 
difficulties and then changed places with another boy who was drawing. This con- 
tinued until the three boys had completed the six movement difficulties. The order 
of presentation was randomised for each boy. 


RESULTS 


Movement times per response on the serial task were obtained by dividing the
total time per movement difficulty (30 seconds) by the total number of hits in the
target area. On the discrete task, reaction and movement times were displayed by the
apparatus. Errors of execution, that is, missing the target area, were below 6 per cent 
for both tasks. 


Figure 1 presents the mean reaction time on the discrete task and movement
times on the serial and discrete tasks as a function of movement difficulty. Regression
lines were calculated and are shown in Figure 1 with b = 0.216 for the serial task and
b = 0.09 for the discrete task. On both tasks, as the movement difficulty was
increased, there was a concomitant increase in movement time. A 6 × 2 repeated
measures ANOVA was employed to analyse this trend. There was a main effect for
movement difficulty, F(5, 110) = 47.51, P < 0.001, and for task, F(1, 22) = 30.29,
P < 0.001. However, there was also a task by movement difficulty interaction,
F(5, 110) = 10.37, P < 0.001, and this was further analysed using tests of simple main
effects followed by Tukey's HSD. These tests revealed that at movement difficulties
of 4, 4.585, 5 and 5.585 bits, the times on the discrete task were significantly faster
than those on the serial task. No significant differences were evidenced at the two
lowest movement difficulties. Within both serial and discrete tasks, there was a
significant difference between all the movement difficulties (P < 0.01).


The discrete task provided an opportunity to consider the independence of 
reaction and movement times. Figure 1 shows that as movement time increased 
with movement difficulty, there was no significant increase in the preceding reaction 
time, F(5, 55) = 0.67, P > 0.05.

On both tasks the capacity of the motor system was calculated by using the
formula:

    Capacity (C) = Index of Difficulty (ID) / Movement Time (MT)


The results are shown in Table 1. 
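
A minimal sketch of this calculation follows, together with a consistency check between Table 1 and Table 2. The serial movement time is taken as 30 seconds divided by the number of hits, capacity is ID/MT, and inverting the formula recovers the mean movement times implied by the Table 1 capacities. The hit count in the example is hypothetical; the capacities and D values are those printed in Tables 1 and 2, and all function names are illustrative.

```python
# Index of difficulty (bits) and capacities (bits per second) from Table 1.
IDS = [3.0, 3.585, 4.0, 4.585, 5.0, 5.585]
SERIAL_CAPACITY = [6.68, 7.29, 6.36, 6.8, 5.34, 5.9]
DISCRETE_CAPACITY = [7.87, 8.7, 9.26, 9.6, 9.09, 9.04]
# Serial minus discrete movement time (msec) as printed in Table 2.
TABLE_2_D_MS = [68, 80, 197, 198, 387, 328]


def serial_movement_time(hits: int, total_time_s: float = 30.0) -> float:
    """Mean time per movement on the serial task: total time / number of hits."""
    return total_time_s / hits


def capacity(id_bits: float, movement_time_s: float) -> float:
    """Capacity C = ID / MT, in bits per second."""
    return id_bits / movement_time_s


def implied_movement_time_ms(id_bits: float, capacity_bits_per_s: float) -> float:
    """Invert C = ID / MT to recover the movement time in milliseconds."""
    return 1000.0 * id_bits / capacity_bits_per_s


# Hypothetical example: 67 hits in 30 seconds at ID = 3 bits.
example_capacity = capacity(3.0, serial_movement_time(67))

# Serial minus discrete movement time implied by the Table 1 capacities.
implied_d_ms = [
    round(implied_movement_time_ms(i, s) - implied_movement_time_ms(i, d))
    for i, s, d in zip(IDS, SERIAL_CAPACITY, DISCRETE_CAPACITY)
]

print(round(example_capacity, 2))
print(implied_d_ms)
```

The implied differences agree with the D row of Table 2 to within about 2 msec, which is the size of discrepancy that rounding of the published capacities would be expected to produce.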

In 1964, Fitts and Peterson reported that the difference per movement between 
serial and discrete tasks varied between 100 and 200 milliseconds depending on the 
difficulty of movement. This was always less than the reaction time for the discrete


FIGURE 1

SCORES AND LINEAR REGRESSION LINES FOR SERIAL, DISCRETE AND REACTION TIME TASKS.

[Plot not reproduced. Series: Serial Task, Discrete Task and Reaction Time; vertical axis scaled from 100 to 1000; horizontal axis: Index of difficulty (Bits), 3 to 5.6.]

TABLE 1

CAPACITY OF BOYS ON SERIAL AND DISCRETE TASKS

                         ID (bits)
Task         3      3.585    4      4.585    5      5.585
Serial       6.68   7.29     6.36   6.8      5.34   5.9
Discrete     7.87   8.7      9.26   9.6      9.09   9.04

TABLE 2

DIFFERENCE BETWEEN SERIAL AND DISCRETE TASKS AND REACTION TIMES
FOR DIFFERENT INFORMATION LOADS

                         ID (bits)
             3      3.585    4      4.585    5      5.585
D (msec)     68     80       197    198      387    328
RT (msec)    325    320      327    317      318    341


task, and the authors proposed that some processing, in addition to movement control,
was taking place during a movement in preparation for the subsequent movement.
Table 2 shows the difference (D) between the serial and discrete movement times for
the same information loads. The reaction time for each particular movement
difficulty is also shown.


DISCUSSION


The relationship between amplitude and width and movement time has a great 
deal of generality and on the two tasks used in the present study is shown to be con- 
sistent with Fitts’ Law. The independence of reaction time and movement time was 
confirmed using an experimental paradigm in which the movement to be made was 
completely predictable. The fact that the subject knew in advance what response 
was required appears to be a key factor. Klapp et al. (1974) also demonstrated that
changing the difficulty of a movement response (in their case from “dit” to “dah”)
had little effect in a simple reaction time paradigm and proposed that this was because
the subject can pre-programme the movement. If a choice response paradigm had
been used, then the more difficult the movement the longer reaction times would be 
expected, representing the subject organising his response. Wade et al. (1978) using 
both simple and choice response paradigms confirmed this finding. Movement 
difficulty only raised reaction time in the choice response paradigm. 

Table 1 shows the discrete task to have a range of capacity from 7.9 to 9.6 bits
and the serial task from 5.3 to 7.3 bits. Both tasks show a range of less than two bits
per second, which is consistent with Sugden's (1980) study using normal children. In 
terms of absolute capacity these ESN boys' scores on the discrete task were similar to 
normal 6-year-old children, and on the serial task to normal 8-year-olds. The formula
C = ID/MT computes the capacity of the system at a given movement difficulty; the
reciprocal of the slope has been used by other researchers (Langolf et al., 1976;
Salmoni and Pascoe, 1978) and shows the capacity of the motor system over a range
of movement difficulties, that is, the rate of information gained. Using this formula,
the discrete and serial tasks, with capacities of 11.1 and 4.67 bits per second
respectively, were similar to the normal children aged 6 employed by Sugden (1980).
Whichever formula is used, the ESN boys evidence at least a five-year difference
between themselves and normal children. If one is performing an information
processing analysis of ESN boys' motor skills, then this deficiency in the capacity of 
the motor system puts ESN boys at a distinct disadvantage in motor skills involving 
speed and accuracy. The steepness of the slope of the line on the serial task would 
seem to indicate that ESN boys found the higher information loads differentially 
more difficult than did normal children. 
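
The slope-based capacity measure discussed above can be illustrated with the regression slopes reported in the Results (b = 0.216 for the serial task and b = 0.09 for the discrete task). The sketch below assumes those slopes are expressed in seconds per bit, which the paper does not state explicitly; the names are illustrative.

```python
# Regression slopes from the Results, assumed to be in seconds per bit.
SLOPES_S_PER_BIT = {"serial": 0.216, "discrete": 0.09}


def capacity_from_slope(slope_s_per_bit: float) -> float:
    """Capacity over a range of difficulties: reciprocal of the MT-on-ID slope."""
    return 1.0 / slope_s_per_bit


capacities = {
    task: round(capacity_from_slope(slope), 2)
    for task, slope in SLOPES_S_PER_BIT.items()
}
print(capacities)
```

The steeper serial slope gives the lower capacity. The serial value computed from the rounded slope comes out slightly below the 4.67 bits per second quoted in the text, presumably because the published slope is itself rounded.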


Speculations concerning strategies that the ESN boys might employ on the serial 
task can be explored by investigating the time difference between the serial and discrete 
tasks. These are shown in Table 2 along with the reaction time preceding the discrete 
movement. If the boys were treating the serial task as discontinuous, that is, they 
were moving to one circle, stopping, and planning the movement to the next circle, 
then the D score would be very similar to the reaction time. If, however, some
planning is taking place during the movement, then D would be less than the reaction
time. The results are quite clear: at the low and intermediate levels of movement 
difficulty, there is much ongoing planning, such as prediction and anticipation of 
feedback. However, at the highest levels, and the main influence seems to be target 
size, the boys' total central capacity is required to control the movement and there is 
none spare to use for prediction and anticipation. And so, although the ESN boys 
are generally poorer and slower all round than normals they nevertheless appear to 
be using similar strategies to the normal boys to maintain their accuracy. This finding 
is interestingly analogous to that of Rarick et al. (1972) who find that although 
retarded children are quantitatively less able than normals the factor structure under- 
lying their performance is strikingly similar. To summarise, in tasks which require 
both accuracy and speed of movement, ESN boys fall well behind their normal peers. 
In order to maintain a set level of accuracy they have to move more slowly, and in
this particular study they appear to be at least five years behind normal boys of the
same age. They appear to have particular difficulty with tasks requiring continuous 
movement, where it is advantageous to plan at least part of the sequence ahead. 
When the target is small and the distance to be moved is long there is little evidence 
that they can do more than concentrate on the movement in hand. However, they 
are not wholly incapable of planning ahead. As the task becomes easier they do
seem to be able to evolve a strategy which includes some use of prediction and
anticipation.


REMEMBERING IDEAS FROM TEXT: THE EFFECT OF
MODALITY OF PRESENTATION 


By RUTH GREEN
(Department of Psychology, University of Birmingham) 


SUMMARY. Forty-eight subjects (16-17 years old), randomly assigned to two recall 
conditions (immediate and delayed), were each given two prose passages to recall, one 
after reading, the other after listening. The passages were judged to be difficult in 
content and unfamiliar. An incidental recall procedure was adopted. Following 
Meyer's (1975) analytic procedures, recall protocols were scored for the presence of idea 
units, content units, relationship units and number of units at each level of the content 
structure. The number of errors was scored independently of Meyer's scheme. As
hypothesised for a difficult recall task involving connected prose, results demonstrated
the superiority of recall after reading, both in terms of total units recalled and in number
of errors. This suggests the value of more systematic research into the conditions which
favour one mode of presentation over the other.


INTRODUCTION 


MEMORY processes are an essential part of the ability to acquire knowledge through 
language, and since much of our knowledge is acquired by listening to or reading 
connected discourse the recent interest in studying memory for this type of material 
is important. The trend has been facilitated by the development of a number of 
procedures, often based on linguistic forms of analysis, which allow the identification 
of ideas or semantic text units and the analysis of their structural relationships within 
a particular prose passage. 

The influence of text structure itself on recall has now been fairly clearly estab- 
lished, in particular the fact that main ideas are better recalled than secondary ones. 
For example, Meyer and McConkie (1973) analysed two descriptive passages into 
hierarchical content structures, subsequently finding that idea units highest in the 
hierarchies were best remembered. Johnson (1970) devised an operational technique 
for determining the theme of a passage; subjects cut out successively a quarter, a half, 
and three-quarters of the inessential constituents of the passage. The quarter remain- 
ing was hypothesised to be the most important and did in fact appear most frequently 
in recall. 


Prose analysis schemes also provide a valuable means of analysing and scoring 
recall in the investigation of the effects of variables other than passage structure. 
The variable with which the present study is concerned is mode of presentation, that 
is, whether the subject reads or listens to the prose passage with which he is presented. 


Modality comparisons in short term memory tasks involving lists of discrete 
verbal items have established the superiority of auditory presentation, especially in 
terms of a recency effect (Murdock, 1972). Studies involving recall of connected 
discourse have, however, more generally been interested in comprehension processes 
rather than in memory itself (e.g., Sticht, 1972). Here the experiments have supported 
the view that, at least for the mature reader, reading and listening comprehension are 
equivalent. These studies have usually tested comprehension by asking subjects 
factual questions about the text. Kintsch and Kozminsky (1977) addressed the com- 
prehension issue in a somewhat different way by comparing summaries produced by 
subjects after reading and listening to 2,000-word stories. Again no major differences 
between the two conditions were reported, though summaries after listening did
contain more idiosyncratic details. However, since memory was involved only in
summarising after listening (readers being allowed to refer back to the text as they
worked), this study cannot be directly related to those where memory for meaning
after reading and after listening has been compared. 


Turning then to the relatively few studies within the area of prose memory 
research that have directly manipulated mode of presentation, the most obvious point 
to emerge is the lack of any systematic control of relevant variables in the reported 
work. Sachs (1974), for example, reported a comparison of the two modes of 
presentation but only tested for recall of sentences 0, 20, 40 and 80 syllables away. 
Kintsch et al. (1975) used very short paragraphs of either approximately 25 or 65 
words, and found no mode of presentation effect in free recall in terms of a pro- 
positional analysis. By contrast Smiley et al. (1977), using 400-word folk tales to 
compare free recall of idea units with seventh-graders, found that listening produced 
significantly better recall than reading. In both of these studies recall occurred 
immediately after presentation. 


Factors such as passage length, type of prose (e.g., narrative or descriptive), 
degree of difficulty, and length of delay between presentation and recall, will all con- 
tribute to any particular recall performance. Yet none has been investigated systema- 
tically with respect to mode of presentation, despite their obvious relevance in 
educational settings. In particular, subjects have not been presented with difficult 
texts or tasks for which recall after reading is more likely to be superior. In this 
situation, as pointed out by Kintsch and Kozminsky, either the content is likely to be 
unfamiliar, or, in the case of a narrative text, no well-established schemata are 
available. In either case the individual's greater control over reading rate compared 
with listening rate should favour recall performance in the former condition. 


The main purpose of the present study, then, was to compare recall after reading 
with recall after listening, using passages which subjects would be likely to find 
relatively difficult to understand. The difficulty of the task was further increased by 
adopting an incidental recall procedure. Under these conditions recall after reading 
was expected to be superior to recall after listening. Delay was introduced as an 
additional factor, since again this variable has received little attention in terms of 
mode of presentation. 


The passages and prose analysis scheme used by Meyer (1975) were considered 
suitable for the stated objectives. Meyer analysed passages into an hierarchical 
content structure of semantic elements; ‘ content units’ were organised according to 
their ideational prominence and related to each other by ‘relationship units’ which 
were also scored in recall protocols. Such a scheme makes it possible to assess both 
the actual content of the material that subjects remember, and in addition, the extent 
to which they organise that material into a coherent whole. 


METHOD 

Subjects and design 

The subjects were 48 members of the sixth form of a local grammar school. 
Their ages ranged from 16 to 17 years. They were randomly assigned to two main 
experimental groups (immediate and delayed recall), and were then divided further 
into sub-groups of six. Within each of the main experimental groups each of the 
four sub-groups received a different combination of passage order and mode of 
presentation order. 


Materials 

The two passages chosen for presentation were those entitled ‘Colour of Parakeets’
(Passage A) and ‘Treatment of Psychological Disorders’ (Passage B) from
Meyer (1975). Both passages contained approximately the same number of words
(638 and 641), a similar number of levels in the content structure hierarchy (9 and 11),
and also yielded on analysis by Meyer's procedure a similar number of idea units (237
and 257). A full content structure analysis of these passages together with details on 
how the analysis is performed can also be found in Meyer. The content of both 
passages was believed to be equally unfamiliar to the subjects. Since the passages 
were also originally prepared for presentation to students at undergraduate level it 
was assumed that their content would prove to be relatively difficult for the subjects 
in this study, who were still at school. 


Instructions were given to subjects in written form. The visual presentation of 
passages was included in this format, while auditory presentation was in the form of 
a taped recording.


Procedure 

Immediate recall group. No information was given to the subjects on the nature 
of the experiment or the task that they would have to perform, other than that they 
should listen to one and read the other of two short passages as instructed. Thus the 
subjects were not aware that total free recall was required until given recall instructions. 


Written instructions before the taped passage informed subjects that the tape 
would last approximately 5 minutes and that they should try to concentrate on what 
was being said as if they were in a lecture. In fact the tapes for both passages lasted 
5 minutes 45 seconds. For the passage to be read the instructions again indicated 
that subjects would be allowed approximately 5 minutes study time, but that they 
might use this in whatever way they thought most beneficial, provided they did not 
stop before they were told to do so. Again 5 minutes 45 seconds was allowed for 
study. A break of 1 minute was allowed between visual and auditory presentations.


At the end of the second presentation subjects were immediately given further 
written instructions for free recall. These stressed the importance of writing down 
all that could be remembered from the passages in turn, in the order that they had 
been presented, and in sentence form. It was further indicated that it was not 
important to try to recall exact wording as long as the original meaning was retained. 
Isolated words could be included if their relation to the rest of the passage had been 
forgotten, but subjects should indicate this by writing “I remember . . . but cannot
remember how it relates to other information.” It was considered that this would
allow maximum scoring on content units when the organisation of material was 
forgotten. Finally, subjects were asked to attempt approximations to difficult or 
unfamiliar words when they could not remember them exactly. These were later 
scored as correct if recognisable.


Subjects were allowed as much time as they required for the recall task. 


Delayed recall group. The presentation procedure for this group was exactly 
as for the immediate recall group. After both presentations, however, subjects were 
asked to leave and return after 2 hours. Normal school activities were resumed during 
this delay. 


Recall of passages after the delay was carried out under the same conditions as 
for the immediate recall group. 


Scoring 

Recall protocols were scored with the aid of the content structure of the relevant 
passage (Meyer, 1975), for the presence of total idea units, content units, and relation- 
ship units. In a similar way scores were also obtained for number of units recalled 
at each level of the content structure. These raw scores were converted into per- 
centages prior to the main analysis. 


An idea unit was scored as present as long as the wording in the protocol
paraphrased that in the original. The presence of content units was scored regardless of
whether or not they were recalled in the correct relationship to other information.
Similarly relationship units were scored as present if they correctly related to content 
units, and also if the use of an incorrect content unit indicated that the subject knew 
that the relationship existed but incorrectly remembered the details of the content. 


Finally the recall protocols were examined for errors and inferences indepen- 
dently of the content structure analysis. 


RESULTS 
Content structure recall scores 
Percentage scores for number of units recalled were analysed in terms of a mixed
analysis of variance with four main factors and repeated measures on one factor. The
between-subjects main factors were delay (immediate vs. delayed recall), passage order
(A-B vs. B-A), and mode of presentation order (auditory-visual vs. visual-auditory). The
within-subjects main factor was mode of presentation (reading vs. listening).


For total idea units recalled this analysis revealed a significant effect of mode of
presentation (F = 33.7, df = 1, 40, P < 0.01). Neither delay nor the mode of
presentation by delay interaction yielded a significant effect. The group means and
standard deviations for main factor groups are shown in Table 1.


These results indicate superior recall of the ideas in the passages after reading 
compared with recall after listening. Although the delay effect did not reach statis- 
tical significance the results here were in the expected direction, with delayed recall 
being poorer than immediate. 


A similar analysis of variance was performed separately on percentage scores for 
content units and relationship units. Again only a significant effect for mode of
presentation was revealed in each case, with F = 32.04, df = 1, 40, P < 0.01 for
relationship units, and F = 28.34, df = 1, 40, P < 0.01 for content units. Group
means and standard deviations for percentage recall in these categories are also shown 
in Table 1. These results provide no evidence that the recall of content per se and the 
recall of the relational aspects of prose are differentially affected by either mode of 
presentation or the introduction of a delay between presentation and recall. 


Content structure levels 

All units can also be divided into categories on the basis of their level in the 
hierarchical representation of the content structure. Highest level (level one) units 
correspond to the most important ideas in a passage, and on the basis of previous
research would be expected to be more frequently present in recall protocols than 
units at lower levels (levels two to nine or eleven). A comparison of the extent of 
recall at each level for the different conditions of reading and listening was therefore 
also made. Mean percentage recall scores at each level are shown in histogram form 
in Figure 1. 


TABLE 1

PERCENTAGE RECALL SCORES, MAIN FACTOR GROUPS

                         Total Units     Content Units   Relationship Units
Group                    Mean    SD      Mean    SD      Mean    SD
Mode of presentation
  Reading                22.7    12.0    22.4    12.2    22.9    10.9
  Listening              14.0    6.9     14.7    7.0     13.1    7.2
Delay
  Immediate              20.4    10.5    20.4    10.8    20.4    9.9
  Delayed                16.3    10.5    16.7    10.3    15.8    10.5


FIGURE 1

PERCENTAGE SCORES AT EACH CONTENT STRUCTURE LEVEL, READING VERSUS LISTENING

[Histograms not reproduced. Panels: READING and LISTENING; vertical axis: Percentage Recall Scores; horizontal axis: Content Structure Levels.]


It can be seen that no systematic pattern of recall was obtained in either condition. 
However, as expected, level one units were best recalled in both cases, averaging 
66 per cent over all recall protocols, compared with 18 per cent for the other levels 
combined. In order to discover whether patterns of recall across levels differed 
between conditions a Spearman rank order correlation was carried out between 
reading and listening recall scores. Ranked percentage recall per level correlated 
significantly, ρ = 0.67, P < 0.05. Thus mode of presentation did not differentially
affect the pattern of recall across levels of the hierarchy in the content structure.


Errors

Accuracy of recall is a central issue if one accepts a constructive or reconstructive 
view of memory processes; errors, particularly of an inferential type, are to be 
expected in recall, at least under certain conditions. To complete the overall picture 
of recall characteristics it was therefore decided to examine recall protocols for errors 
of all types. As far as possible one error was counted as such if equivalent to one 
content unit in a content structure covering all additions to the original found in a 
particular protocol. Errors were in fact of a variety of types, including simple 
mistakes in remembering details such as dates, incorrect relationships between correct 
details, inferences, and even what appeared to be totally new information with no 
objective inferential basis in the original. 

An error score was calculated for each recall as “number of errors per 100 words
of protocol”. Comparing errors for the two modes of presentation, mean error score
for reading was 1.64 and for listening 2.26. For immediate recall the mean error
score was 1.91, and for delayed recall 1.99. A similar analysis of variance to those
previously carried out was then used to analyse these error scores. This yielded
a significant effect for mode of presentation only, F = 4.36, df = 1, 40, P < 0.05.
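
The error score defined above is a simple rate, and a short sketch makes the normalisation explicit. Counting words by splitting on whitespace is an assumption made here for illustration; the paper does not say how protocol length was counted, and the function name and example protocol are hypothetical.

```python
def error_score(n_errors: int, protocol_text: str) -> float:
    """Number of errors per 100 words of recall protocol."""
    # Assumption: words are counted by splitting the protocol on whitespace.
    n_words = len(protocol_text.split())
    if n_words == 0:
        raise ValueError("protocol contains no words")
    return 100.0 * n_errors / n_words


# Hypothetical protocol of 150 words containing 3 scored errors.
example_protocol = "word " * 150
print(error_score(3, example_protocol))
```

Expressing errors per 100 words keeps the measure comparable across protocols of different length, which matters here because less was recalled after listening than after reading.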


DISCUSSION 


The main finding in this experiment was the superiority of recall after reading 
compared with recall after listening. This result was predicted on the assumption 
that subjects would find both the incidental recall task and the actual content of the 
passages difficult. Reports obtained after the experiment suggested that this assump- 
tion was in fact justified, especially for Passage B. Such difficult texts should demand 
of the subject a more active role in terms of allocating available study time to the most 
difficult and/or most important parts of the passage. In the present context control 
of reading rate in this way was possible, while listening rate was constant and fixed; 
in addition re-reading was possible while re-listening was not. 


An explanation for the contradiction between this and previous results is thus 
available. Taken as a whole these results suggest the value of more systematic studies 
of mode of presentation, involving experimental manipulation of factors such as 
degree of difficulty and degree of control over reading and listening rates. As well as 
actual content and its relation to an individual’s current knowledge, type of prose 
(narrative, descriptive, etc.) and passage length are likely to contribute to degree of 
difficulty. 

The difference in recall favouring reading over listening was subjected to further 
analysis in terms of categories of unit (content vs. relationship) and levels in the content 
structure. In neither case was evidence obtained to suggest that the main effect was 
limited to any one aspect of a prose passage. Rather it seemed to be an undifferen- 
tiated effect, working for important ideas and details, and for content as well as 
relational ideas. This result is in agreement with Smiley et al. in finding that mode
of presentation does not differentially affect patterns of recall across levels of 
structural importance. 


The other main factor in the experiment, delay, did not produce such conclusive 
results, the effect being in the expected direction though not quite reaching significance. 
For this reason it was not considered worthwhile to proceed further with the analysis 
of recall at the various levels of the content structure. A longer delay than two hours 
would presumably have led to a clearer demonstration of a delay effect. 


Finally, with regard to errors and inferences, the significant result of the analysis 
of variance here confirms further the poorer recall performance in the listening 
condition. Yet this result seems to require an additional explanation, since less 
could have been recalled without subjects necessarily making more errors. While
more errors after listening could be the result of more active constructions due to
poorer comprehension and storage, or more active reconstructions due to poorer 
retrieval, the result is also consistent with the adoption of a strategy of conscious 
invention (Gauld and Stephenson, 1967). This strategy is most likely to occur when 
subjects perceive themselves to be under pressure to produce something which is 
coherent and complete, as in the present case. Its adoption here, particularly in 
recall after listening, might also be favoured by subjects themselves being aware of their 
poorer recall performance in this condition relative to recall after reading. Whatever 
the explanation the result at least suggests that errors, especially those not of an
inferential type, may be too often ignored in considering recall of prose in ecologically 
valid situations. Examiners and teachers, for example, tend to mark scripts not just 
on the presence of correct material, but also with due consideration for the number 
of mistakes or inventions. These can be scored relatively easily and objectively, and 
taking them into account can only add to our knowledge of the ways in which prose 
material is remembered. 
RELATIONSHIPS BETWEEN ANXIETY AND CLASSROOM 
BEHAVIOURS OF ADULT LEARNERS 


By Y. L. J. LAM 
(Faculty of Education, Brandon University, Manitoba) 


Summary. The relationship between three types of anxiety and eight types of classroom 
behaviours was examined. Data for the study were gathered through both questionnaire 
and observation at the beginning, midway through and during the final phase of the 
teaching-learning process from 81 adult learners attending courses at Brandon University. 
Findings from the study seem to hold promise of a better understanding of the intangible 
mental state of adult learners from their overt behaviours as correlation and canonical 
variate analyses revealed close and direct relationships among anxiety and behaviour 
variables. Further, the communication anxiety theory seemed also to be verified as 
there was definitely a difference between high and low anxiety learners in the frequency of 
behaviours displayed. 


INTRODUCTION 


While literature abounds in the traditional experimental studies of anxiety, a domain 
that remains largely unexplored, but assumes great potential for the advancement of 
knowledge and skills in the teaching of adults, is the nature of relationships between 
anxiety and the behaviours, verbal as well as non-verbal, of learners in the classroom. 
At present, fragmented pieces of information have been gradually accumulated from 
studies of interpersonal communication and classroom behaviours at different age 
levels. From the first category came the Communication Anxiety Theory (Beatty 
and Beatty, 1975) which proposed that anxiety as a negative reaction to the task of 
interacting with another person, or persons, generates strong avoidance tendencies. 
Thus a highly anxious person, given a choice, will avoid communication (Allen, 1976; 
Dorsky, 1977; Kent, 1971). 


From those dealing with classroom behaviours of young adults, Yoakley (1975) 
showed that student participation in classroom management led to an increase in 
positive behaviours. Feichtner and Burstyn (1974), while not describing specific 
behaviours displayed by adults, did identify four patterns that provided conceptual 
linkage to studies on communication anxiety theory. They postulated that children 
in the classroom normally undergo tentative and testing behaviour stages before finally 
adopting one of the four mutually exclusive behaviour patterns, i.e., ‘Active 
Learning Pattern’, ‘Passive Learning Pattern’, ‘Non-participative Pattern’ and 
‘Disruptive Pattern’. A greater degree of maturity and persistent sets of tendencies 
render adult learners less dependent on an instructor's mode of delivery (Kidd, 1967). 
However, the perceived compatibility between the inherent set and the existing 
learning environment registers the extent of adjustment along with a corresponding 
anxiety which adult learners experience. Selected single behaviours rather than 
specific behaviour patterns as described are expected to be consciously or unconsciously 
displayed. 

Three general questions were posed to detect the relationship between anxiety 
and behaviours of adult learners in the classroom. 

(1) Are anxiety and behaviours in the classroom related? 

(2) Can the selected types of anxiety and behaviours in the class originate from 
different causes? 


(3) Are there differences between high- and low-anxiety learners in the frequency 
of exhibition of classroom behaviours in all stages of instruction? 
METHOD 
Sample 
Eighty-one adult learners from six educational courses in the areas of special 
education and educational administration constituted the total sample of the present 
study. All these courses had the same duration and frequency of meetings so that 
variations in anxiety level and frequency of behaviours could be placed on some 
common basis for comparison. 


The ages of these learners ranged from 18 to 62 with the mean being 30. The 
proportion of female to male learners was about two to one (53 vs. 28). Occupation- 
wise, 28 were senior university students taking their final year in the educational 
programme, 35 were teachers, eight were school administrators and the remaining ten 
had professions that were not directly related to education. 


Variables 
Types of anxiety had been derived from the factor analysis of a nine-item 
questionnaire in an earlier work (Lam, 1978). These items are: 


(1) To what extent are you contented with the general course outline? 
(2) To what extent are you contented with the direction that the course is taking 
up to this stage? 
(3) To what extent are you certain that this course meets your needs? 
(4) To what extent are you confident of meeting the course's demands? 
(5) To what extent do you feel at ease regarding the instructor's general approach? 
(6) To what extent do you feel at ease interacting with your instructor? 
(7) To what extent do you feel at ease interacting with your classmates? 
8) To what extent do you feel at ease in sharing your past experiences with 
others in class? 
(9) To what extent are you confident of your performance in the course so far? 


Each of these items was assessed by a five-point scale with point 1 on the scale 
denoting, for instance, ‘very discontented’, ‘very uneasy’, ‘very worried’, ‘very 
uncertain’, to point 5 indicating ‘very contented’, ‘very easy’, ‘very relaxed’, 
‘very certain’, and the like. Learners were asked to circle a point that best reflected 
their feelings at that period of time. In view of the positive nature of the items, 
anxiety scores of learners were indirectly tabulated by consulting the reverse portion 
of the scale. For instance, if they expressed feelings of ease in interacting with the 
instructor by circling point 4 in the scale, their anxiety level in this respect was assumed 
to have a scale value of 1. 


Three factors emerged. Factor I was termed Anxiety about course arrangements 
as it encompassed stress with respect to course outline, direction of the course, 
accommodation of one’s requirements and the approach adopted by the instructor. 
Factor II was termed Anxiety about interpersonal process as it was accounted for by 
items measuring tension in interaction with the instructor, classmates and with 
experience sharing. Factor III was termed Anxiety about evaluative outcomes as it 
embraced items tapping learners’ concern about their ability to cope with course 
demands and performance. 

Behaviour variables were chiefly selected and modified from the four learning 
patterns described by Feichtner and Burstyn. Eight such verbal and non-verbal 
behaviours were operationalised and employed: 


(1) Vague Questions refer to questions of a very general nature and/or they are 
not specifically related to the topic discussed. 

(2) In-depth Questions are questions directly related to the topic discussed as 
well as those that probed further into details. 




(3) Clarification of Assignments are questions concerned solely with the nature 
and refinements of the assignments. 

(4) Interaction with Instructors is confined to student-initiated verbal exchanges 
with instructors. 

(5) Interaction with Classmates refers to learners’ responses to the questions 
raised by classmates. 

(6) Experience Sharing refers to responses reflecting learners' past working 
experience relevant to the topic discussed. 

(7) Positive Responses include both remarks supportive of the instructor's com- 
ments at lectures as well as non-verbal (i.e., head nodding, smiling, etc.) 
reflecting approval of the instructor's viewpoints. 

(8) Negative Responses include both rejection or criticism of instructor’s explana- 
tions as well as non-verbal behaviours (i.e., head-shaking, frowning, etc.) 
that signify disapproval. 


Procedure 

Classroom behavioural data for the present study were collected by observation, 
and anxiety data by the questionnaire. Six observers were trained prior to assignment 
to the classes for observation. Through viewing video-tapes recording the first two 
sessions of the six classes the observers became acquainted with the adult learners. 
They then worked through exercises on the use of the observation charts employed 
for checking the frequency of occurrence of the eight learner behaviours under study. 


Behaviour data of learners were collected through time-sampling design. In 
brief, observations were made on the 3rd and 4th, 7th and 8th, 11th and 12th meetings 
of the courses. These meetings were considered to be important in the sense that each 
consecutive period of observation represented the beginning, midway and terminating 
points of the teaching-learning process. 


The anxiety questionnaire was administered to learners on three separate occa- 
sions to coincide with the three periods of observation. This allows for a more 
accurate assessment of relationships between behaviour and the cognitive affective 
conditions of learners at the selected points of measurements. 

Classification of learners into ‘high’ and ‘low’ anxiety groups was based on 
their overall responses to the five-point scales assessing the three types of anxiety so 
that those with the average below the mid-point 3 were considered to have ‘low’ 
anxiety and those with the average above 3 were viewed to have ‘high’ anxiety. 
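
The grouping rule just described can be sketched in a few lines. This is only an illustration: the function name is hypothetical, the input is assumed to be a learner's anxiety scores after reversal of the scale, and the source does not say how an average of exactly 3 was treated, so that case is left unclassified here.

```python
from statistics import mean


def classify_anxiety(anxiety_scores, midpoint=3.0):
    """Assign a learner to the 'high' or 'low' anxiety group.

    anxiety_scores: anxiety values on the five-point scale
    (assumed to be already reverse-scored).
    Returns None when the average equals the mid-point, because the
    source does not state how that case was handled (assumption).
    """
    average = mean(anxiety_scores)
    if average > midpoint:
        return "high"
    if average < midpoint:
        return "low"
    return None


print(classify_anxiety([4, 5, 3]))  # average 4, so 'high'
print(classify_anxiety([1, 2, 2]))  # average below 3, so 'low'
```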


RESULTS 
Reliability of the anxiety instrument 
Test-retest reliability of this questionnaire yielded coefficients of 0.70 (between 
the first and second period), 0.71 (between the second and the third period), and 0.67 
(between the first and the third period). These seem to be satisfactorily high for the 
present exploratory study. 


Reliability of observation on behaviour data 

The coefficients of observer agreement for the six teams of observers were found 
to range from 0.68 to 0.85. This reflected a relatively strong agreement in observation 
and recording between members of the six teams. 


Anxiety and behaviours in the classroom 

Pearson product-moment correlations between anxiety and classroom behaviours 
at three stages of instruction revealed that they were significantly related (Table 1). 
In particular, during the beginning stage of instruction, Anxiety about course arrange- 
ments and Anxiety about evaluative outcome were positively related to negative 
responses. In other words, higher anxiety in these two aspects restrained learners 
from voicing unsupportive remarks or gestures. Anxiety about the interpersonal 
process reduced significantly all the selected classroom behaviours with the exception 
of vague questions, assignment clarification and negative responses. Higher anxiety 
in this aspect prompted learners more often to raise vague questions and less frequently 
‘in-depth’ ones, to ask further clarification about assignment, to be less ready to 
interact with the instructor and classmates or to share experiences related to the 
course, and to display far less positive responses. 


Midway through the course, anxiety about the course greatly reduced the sharing 
of experiences in class; anxiety about the inter-personal process also affected learners 
in that they continued to raise vague questions, lowered their interaction with their 
classmates, their sharing of experiences and display of positive responses. Anxiety 
about evaluative outcomes seemed to have lost effect on any behaviour, reflecting 
possibly that learners were less concerned about formative assessment at this stage. 


In the last stage of the instruction three types of anxiety seemed to affect equally 
interaction with the instructors, with the classmates and experience sharing. In other 
words, higher anxiety about the course, the interpersonal process and evaluative 
outcomes markedly decreased their participation in class. 


To the question whether the selected types of anxiety and classroom behaviour 
originate from different causes or whether they are causally linked, canonical variate 
analyses were employed. Table 2 reveals that in all three stages of instruction only 
one dimension of relationship between three types of anxiety and eight types of class- 
room behaviours was found to be statistically significant. In other words, various 
types of anxiety could be directly linked to the classroom behaviours. Such an 
observation is extremely important, as it establishes and simplifies relationships 
between intangible anxiety and observable classroom behaviours. Given the relative 
permanence of such a relationship in a longitudinal perspective, an understanding of 
adult learners’ anxiety can be traced to reasons why adults do or do not behave in 
such a fashion in the class. In practice, the presence or absence of concrete verbal 
and non-verbal acts of learners provide vastly rich diagnostic materials to the 
observant instructors for assessing how successfully they are organising a course for 
adults. 
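
The paper does not state how the significance of each canonical dimension in Table 2 was tested. One standard route, offered here only as an assumption, is Bartlett's chi-square approximation based on Wilks' Lambda, with N = 81 learners, p = 3 anxiety variables and q = 8 behaviour variables. The function name below is hypothetical. Under this approach the degrees of freedom come out as 24, 14 and 6, which agrees with the df column of the table.

```python
import math


def bartlett_tests(canonical_correlations, n, p, q):
    """Sequential tests of canonical dimensions (Bartlett's approximation).

    For the k-th test (k = 0, 1, ...), Wilks' Lambda is the product of
    (1 - r_i^2) over the k-th and all later canonical correlations.
    Returns a list of (wilks_lambda, chi_square, df) tuples.
    """
    multiplier = n - 1 - (p + q + 1) / 2
    results = []
    for k in range(len(canonical_correlations)):
        wilks_lambda = math.prod(
            1 - r ** 2 for r in canonical_correlations[k:]
        )
        chi_square = -multiplier * math.log(wilks_lambda)
        df = (p - k) * (q - k)
        results.append((wilks_lambda, chi_square, df))
    return results


# Canonical correlations reported for the beginning stage of instruction.
for lam, chi2, df in bartlett_tests([0.53, 0.35, 0.22], n=81, p=3, q=8):
    print(f"Lambda = {lam:.2f}, chi-square = {chi2:.2f}, df = {df}")
```

Because the published correlations are rounded to two decimals, the chi-square values computed this way can only approximate those printed in the table.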


To the question whether high- and low-anxiety learners differ in the frequencies 
of exhibited classroom behaviours in all stages of instruction, it seems evident from 


TABLE 2 

CANONICAL VARIATE ANALYSIS OF ANXIETY AND CLASSROOM BEHAVIOURS AT THREE 
STAGES OF INSTRUCTION 

            No. of                 Canonical     Wilks' 
Stages      Traits   Eigenvalue   Correlation   Lambda    χ²      df   Sig. 
Beginning     1        0.29         0.53         0.59    38.89    24   0.02** 
              2        0.12         0.35         0.83    13.39    14   0.49 
              3        0.04         0.22         0.95     3.71     6   0.72 
Mid-way       1        0.26         0.51         0.60    37.43    24   0.04* 
              2        0.11         0.33         0.81    15.06    14   0.37 
              3        0.08         0.28         0.91     6.17     6   0.40 
Last          1        0.27         0.52         0.61    36.45    24   0.05* 
              2        0.14         0.37         0.84    12.79    14   0.54 
              3        0.02         0.16         0.97     1.85     6   0.93 

* P<0.05 
** P<0.01 

Table 3 that in almost all behaviours other than negative responses, low-anxiety 
learners displayed higher frequencies than high-anxiety learners. F ratios showed 
that, in the beginning stage of instruction, these two groups differ significantly in 
clarification of assignment, interaction with classmates and in the amount of ex- 
perience sharing. In the second stage no statistical differences were detected. In the 
last stages of instruction, low-anxiety learners again had significantly higher interaction 
with classmates and participated more in experience sharing. In general, then, the 
communication anxiety theory seemed to be verified. 


CONCLUSION 


The present analysis reveals that various types of anxiety and classroom be- 
haviours are highly associated with one another. While promoting or discouraging 
specific behaviours at different stages of learning, the three types of anxiety in general 
reinforced one another in explaining the display or absence of certain behaviours. 
The establishment of a direct causal relationship between the intangible emotional 
state of the learners and observable behaviours hopefully would sensitise instructors 
to adult learners' needs and capabilities in the process of acquiring knowledge and 
skills. 

To the extent that highly anxious learners display strong avoidance behaviours 
from active participation in class, instructors should no longer take for granted that 
absence of participatory behaviours on the part of learners denotes that things have 
gone well for them. Rather, they should initiate greater interaction with highly 
anxious learners in order to identify problem areas. While additional research is 
urgently needed to substantiate these present findings, suffice it to say that the 
identified crucial connection between mental state of adult learners and their overt 
behaviours provides a new and exciting avenue of further exploration in this somewhat 
uncharted domain. 
HOW WELL DO CHILDREN REMEMBER 
WHAT THEY HAVE RECALLED? 


By CATHY GOODMAN AND J. M. GARDINER 


(Department of Social Science and Humanities, 
City University, London) 


SUMMARY. After a series of immediate free recall trials children were given an unexpected 
recall-recognition test, results of which showed that young children, aged about 6 years, were 
quite poor at discriminating between previously recalled and non-recalled items and that they 
were much less accurate than older children aged about 8 years or more (N = 14 per group). 
The findings suggest that young children may be generally quite inaccurate in assessing their 
previous recall performance, and this suggestion was discussed in relation to the development 
of other memory skills. 


INTRODUCTION 


It is reasonable to suppose that some ability to assess how well a particular memory 
task has been performed is a necessary condition if the learner is to discover, adopt and 
evaluate different learning strategies. Older children are much more adept at using effective 
learning strategies than younger children (Brown, 1975; Flavell and Wellman, 1977), and it 
seems possible that this may reflect, at least in part, a growing tendency to assess previous 
memory performance accurately. As it happens, there appear to be only a few scattered 
reports of how well children assess their previous performance. And as Kail (1979) has 
recently concluded, these reports suggest (somewhat surprisingly) that young children 
distinguish quite accurately between remembered and non-remembered information. 


This paper provides further evidence about how well children remember what they have 
recalled. Earlier reports that include evidence on this question may have overestimated the 
accuracy with which children normally assess their previous recall performance. Masur et al. 
(1973), for example, asked their first grade subjects (average age, 7.2 years) to identify recalled 
and non-recalled items after a single free recall trial, and found that their identification 
accuracy for these two classes of item was 0.98 and 0.96 respectively. Though this suggests 
that young children experience little or no difficulty in remembering what they have recalled, 
clearly ceiling effects cannot be excluded as a reason for the observation. Moynahan (1976) 
had first and third grade subjects (average ages, 6.9 and 8.8 years) estimate the number of 
list items they recalled after each of a series of free recall trials and found that, though subjects 
in both age groups were substantially accurate in their estimates, the older children were 
reliably, albeit slightly, more accurate than the younger children. First grade subjects 
recalled an average of 4.60 list items and their average recall estimate was 4.72; third grade 
subjects recalled an average of 5.31 list items and their average recall estimate was 5.45. 
Lastly, in a recent paired-associate learning study, Bisanz et al. (1978) asked subjects 
which response items they had recalled, again on each immediately preceding trial. It was 
found that children aged about 6, 8 and 10 years, and college students aged about 19 years, 
could discriminate between recalled and non-recalled items with d' scores of 3.24, 4.71, 5.16 
and 5.76 in age order. These d' scores indicate a very impressive level of accuracy, even in 
the youngest children. Indeed, the probability of correctly identifying a recalled item was 
0.96 in the youngest group, and even higher in the other groups. 

In both the earlier studies where between age comparisons were made (Bisanz et al., 
1978; Moynahan, 1976), the recall assessment task was given after each of a series of recall 
trials, thereby priming the children to assess their recall performance. Such a procedure is 
more appropriate for showing how well children can remember what they have recalled, 
when set to do so, than for showing how well they normally do remember what they have 
recalled. The two experiments described here adopted a procedure that avoids priming the 
recall assessment task: subjects were given only one, unexpected, recall assessment test at 
the end of the immediate free recall trials (Gardiner and Klee, 1976; Gardiner et al., 1977). 


The first experiment is summarised briefly. The subjects were children attending a summer 
mt scheme and 17 children in each of three age groups were selected. The average 
age of each group was 6.8 years, 9.0 years and 12.9 years. Each subject saw 10 lists of nine 
items drawn from a set of 126 items that were all line drawings of familiar objects together 
with their verbal labels. Immediate free recall of these lists yielded recall probabilities of 
0.32, 0.53 and 0.67 for the youngest through to oldest groups. After the last recall trial, a 
‘recall-recognition’ test was presented. In this test each subject was re-presented with all 
90 list items and asked to indicate which items he or she had recalled. Individual d' scores 
derived from hits (recalled items correctly judged to have been recalled) and false positives 
(non-recalled items incorrectly judged to have been recalled) were calculated. Average d' 
scores for the youngest through to oldest groups were 1.70, 2.24 and 2.39, F = 4.35; 
df = 2, 48, P < 0.025. (The associated false positive rates were, in age order, 0.24, 0.19 and 
0.21.) Thus results show that after a series of free recall trials younger children were 
substantially less accurate than older children in discriminating between recalled and non-recalled 
items. 
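
For readers unfamiliar with the measure, d' is conventionally obtained as the difference between the normal deviates of the hit rate and the false positive rate. The sketch below illustrates that standard signal detection formula; it is not taken from the paper, the function name is hypothetical, and the paper does not say how rates of exactly 0 or 1 (for which the deviate is undefined) were handled.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_positive_rate):
    """Standard signal detection index: z(hits) - z(false positives).

    Both rates must lie strictly between 0 and 1; NormalDist.inv_cdf
    raises StatisticsError otherwise. Any correction for extreme rates
    would be an additional assumption and is not applied here.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_positive_rate)


# Illustrative values only (not data from the study).
print(round(d_prime(0.84, 0.16), 2))
```

A subject who says "Yes" equally often to recalled and non-recalled items obtains d' = 0, whatever the overall rate of "Yes" responses.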

The main experiment we report was primarily designed to replicate this finding, though 
it incorporated several procedural changes that warrant a brief comment. List length and 
type of materials were deliberately confounded with age group differences in order to equate 
the proportion of list items recalled by each group. Also, instead of line drawings of objects, 
only word lists were presented. 


METHOD 


Subjects 

The subjects were 42 children attending a local primary school who were selected in 
three age groups with 14 children in each. The average age of each group was 6.0 years, 7.9 
years and 10.0 years. 


Design and materials 

List words were selected from Carroll and White's (1973) age of acquisition norms. 
The youngest group were presented with a list set of 72 words which are acquired by the age 
of 4 years. The middle age group were presented with a list set of 84 words, 40 of which 
are acquired between 4 and 6 years of age, and the other 44 of which were selected at random 
from the list set presented to the youngest group. The oldest group were presented with a 
list set of 90 words, 30 of which were acquired between 6 and 8 years of age, 40 of which 
were the subset of 40 words (described above) in the list set presented to the middle group, 
and 20 of which were selected at random from the subset of 44 words presented to the youngest 
group. 

These three partially overlapping list sets were arranged arbitrarily such that the list set 
presented to the youngest group consisted of eight lists of nine items, that presented to the 
middle group consisted of seven lists of 12 items, and that presented to the oldest group 
consisted of six lists of 15 items. List structure was not varied but list order and order of 
words within lists were randomised separately for each subject. After the immediate recall 
trials, the recall-recognition test was presented. In this test the appropriate list set was 
re-presented, newly ordered as a single list, and subjects were required to indicate which 
words they had recalled. 
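
The construction of the three partially overlapping list sets can be sketched as follows. The function and argument names are hypothetical, the word pools stand in for the selections from the age of acquisition norms, and the random sampling is only one way of realising the selection 'at random' that the text describes.

```python
import random


def build_list_sets(by_age_4, age_4_to_6, age_6_to_8, seed=None):
    """Build the youngest, middle and oldest list sets.

    by_age_4:   72 words acquired by the age of 4 years
    age_4_to_6: 40 words acquired between 4 and 6 years
    age_6_to_8: 30 words acquired between 6 and 8 years
    """
    if (len(by_age_4), len(age_4_to_6), len(age_6_to_8)) != (72, 40, 30):
        raise ValueError("pools must contain 72, 40 and 30 words")
    rng = random.Random(seed)
    youngest = list(by_age_4)                      # 72 words
    carried_44 = rng.sample(youngest, 44)          # reused in middle set
    middle = list(age_4_to_6) + carried_44         # 40 + 44 = 84 words
    carried_20 = rng.sample(carried_44, 20)        # drawn from the 44
    oldest = list(age_6_to_8) + list(age_4_to_6) + carried_20  # 90 words
    return youngest, middle, oldest


young, mid, old = build_list_sets(
    [f"y{i}" for i in range(72)],
    [f"m{i}" for i in range(40)],
    [f"o{i}" for i in range(30)],
    seed=1,
)
# Set sizes divide evenly into lists of 9, 12 and 15 items respectively.
print(len(young) // 9, len(mid) // 12, len(old) // 15)
```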


Procedure 

The word lists were presented aurally at the rate of about one word every two seconds. 
Immediately afterwards, subjects were allowed about 90 seconds for recall, which was oral, 
before the next list was presented. To try to equate recall strategies all subjects were given 
modified free recall instructions and told to recall the last few presented words in each list 
first. After the final recall trial, the subjects were told about the recall-recognition test. 
Test items were presented on three stapled sheets of paper in upper-case typescript. The 
composition of each of the three sections of the test was constant but the order in which 
sections were presented was varied. During the test, the experimenter read each word aloud 
at the rate of about one word every three seconds. Subjects were instructed to try to remember 
whether they had recalled each word as it was read out and to respond ‘Yes’ if they 
remembered recalling it, ‘No’ if they did not. It should be noted that here, as in the initial 
experiment, even the youngest children appeared to understand these task requirements without 
difficulty. 
RESULTS 


Figure 1(a) shows average recall proportions as a function of age and serial position. 
Typical serial position effects are apparent for each age group and these data are reported 
fully for completeness of the record. It is more important to note that, collapsed over serial 
position, the youngest group recalled 0.29 of the list words, the middle age group recalled 
0.28 of the list words, and the eldest group recalled 0.29 of the list words. Thus the rather 
complex arrangements with respect to list length and materials were successful in equating 
the relative proportions of recalled and non-recalled items in each group. In measuring the 
accuracy with which children in each group could discriminate between recalled and non- 
recalled items it also seemed desirable to take into account possible serial position effects, 
and to this end the different list lengths used in free recall were collapsed into three blocks: 
positions 1-3, 4-6 and 7-9 in the youngest group; positions 1-4, 5-8 and 9-12 in the middle 
group; and positions 1-5, 6-10 and 11-15 in the oldest group. Individual d' scores derived 
from hits and false positives (as defined before) were calculated separately for each serial 
position block. The average d' scores are shown in Figure 1(b), where serial position ‘1’ 
denotes primacy items, ‘2’ denotes middle-list items, and ‘3’ denotes recency items. It is 
apparent that discrimination accuracy substantially increases with age and also that, for all 
ages, recency items, although best recalled, were least well remembered as having been 
recalled (cf. Craik, 1970; also Gardiner et al., 1977). Both age, F = 3.40; df = 2, 39, 
P < 0.05, and serial position differences, F = 6.68; df = 2, 78, P < 0.01, were reliable and 
there was no interaction between the two, F < 1. The false positive rates for serial position 
blocks 1 through to 3 were 0.27, 0.23 and 0.25 in the youngest group; 0.21, 0.22 and 0.20 
in the middle group; and 0.16, 0.14 and 0.12 in the eldest group. Although there appears 
to be a tendency for older children to make fewer false positive errors, a separate analysis 
of these revealed no significant difference due either to age, F = 1.17; df = 2, 39, P > 0.05, 
or to serial position, F = 1.21; df = 2, 78, P > 0.05. In all essential respects, then, these 
results replicate those of the first experiment. 
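
The collapsing of serial positions into three equal blocks, as described above for list lengths of 9, 12 and 15, amounts to a simple integer division. The helper below is a hypothetical illustration of that mapping and assumes a list length divisible by three.

```python
def serial_position_block(position, list_length):
    """Map a 1-based serial position to block 1 (primacy),
    2 (middle) or 3 (recency), using three equal-sized blocks."""
    if list_length % 3 != 0:
        raise ValueError("list length must be divisible by three")
    if not 1 <= position <= list_length:
        raise ValueError("position outside the list")
    block_size = list_length // 3
    return (position - 1) // block_size + 1


# Positions 1-3, 4-6 and 7-9 of a nine-item list fall in blocks 1, 2 and 3.
print([serial_position_block(p, 9) for p in range(1, 10)])
```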


DISCUSSION 


Principal results of this study showed that after several free recall trials young children, 
aged about 6 or so years, did not discriminate all that accurately between recalled and non- 
recalled items and that they were much less accurate than older children, aged about 8 or 
more years. Both the pilot and the main experiment show this outcome clearly. These 
findings suggest that there are marked developmental changes in the accuracy with which 
children do assess their previous recall performance. 


The results reported by Bisanz et al. (1978) and by Moynahan (1976) also provide at 
least some evidence of increasingly accurate recall assessments with increasing age, although 
in each of these studies very much higher levels of accuracy were observed, especially in the 
youngest subjects. This apparent discrepancy between those results and ours could, of 
course, in some way reflect procedural differences between different paradigms. But, as 
noted in the introduction, in both those earlier studies the recall assessment tests were given 
after each recall trial. Such priming of the recall assessment task may not merely raise 
overall levels of performance. It may well attenuate age-related differences. Young children 
may be capable of assessing their previous recall performance quite accurately when set to 
do so, but they may be much less likely than older children to do so spontaneously. Thus 
it also seems possible that young children may be generally quite poor at assessing their 
previous recall. 

This suggestion is at least in good agreement with much else that is now known about 
the development of other memory skills. It makes good sense, for example, when taken in 
conjunction with evidence of developmental changes in prediction accuracy. Young children 
are not very accurate in predicting how well they are likely to recall or when they are ready 
for recall (e.g., Flavell et al., 1970; Levin et al., 1977). And if young children are usually 
also poor at assessing their previous recall, then they would also be less able to tell whether 
the consequences of adopting particular learning strategies are beneficial or detrimental. 
Hence the general tendency for older children to use more effective learning strategies might 
reflect, at least in part, a greater tendency to distinguish accurately between remembered and 
non-remembered information (cf. Kail, 1979). Even when children assess their previous 
performance quite accurately, however, they may not necessarily realise the value of 
modifying their study behaviour in the light of that assessment. Indeed, in a finer grained 
analysis, Bisanz et al.’s (1978) observations suggest that there may be a considerable lag 
between the age at which subjects can most accurately discriminate between recalled and 
non-recalled items and the age at which they apparently utilise this knowledge to modify 
their study behaviour. 
A READABILITY STUDY OF JUNIOR SCHOOL 
LIBRARY PROVISION RELATED TO CHILDREN'S 
INTERESTS AND READING ABILITIES 


By I. B. HILL 
(The Institute of Education, University of London) 


Summary. An analysis of the library reading material of a junior school was related to the 
reading abilities and interests of its pupils. The school's resources of 4,793 library books were 
classified into 50 categories in two roughly equal groupings, Fiction and Information. A 5 per 
cent sample, stratified by category and placement, was measured by a modified Fry Graph. 
Two hundred and fifty-eight children were assessed by Schonell Silent Reading Tests A and B 
and completed a questionnaire. The findings indicated that the readability levels of most of the 
books were considerably higher than the reading ages of the majority of the pupils. The results 
yielded a significant negative correlation between the children's professed interests and the 
subject matter of library non-fiction reading material. 


INTRODUCTION 


Research into the relationships between children and their use of books has traditionally 
investigated either the area of textual complexity or that of reading interests and habits. 
Readability studies have revealed numerous instances of incompatibility between students 
and their school texts, at both junior and secondary levels (Wulfing, 1938; Kennedy, 1974). 
The unreliability of individual teacher assessment of the readability levels of printed material 
has been demonstrated in the US (Chall, 1958) and Britain (Moyle, 1971). 


From studies of children's reading interests the evidence suggests that the major 
determinants of content interest are sex, age and intelligence, in that order, with the first two 
exercising rather greater influence (Terman and Lima, 1925; Jenkinson, 1940; Chall, 1958; 
Ashley, 1970). The link between interest and improved "reading comprehension is well 
established (Klare, 1963; Morris, 1963). Gilliland (1971) placed interest first of the three 
essential relationships between reader and book, preceding both perceptual skill and under- 
standing of language. He suggested that readability was a problem of matching: * On the 
one hand there is a collection of individuals with given interests and reading skills. On the 
other hand there is a range of books and other reading materials, differing widely in content, 
style and complexity. The extent to which books can be read with profit will be determined 
largely by the way in which the two sides are matched ' (p. 21). 


As no comparable study appeared to have been published, it seemed that useful insights 
might be provided by an examination of the matching effectiveness of one junior school in 
the two key areas: 

(a) Children's reading ages against readability levels of library books. 

(b) Children's interests against types of library fiction and non-fiction topics provided. 


METHOD 


The study sample included 258 subjects, 137 boys and 121 girls, aged 7 to 11, excluding 
only absentees, from a junior school roll of 284 pupils. The school was situated in the 
favoured south-eastern region and included a wide range of socio-economic and urban/rural 
differences. The children took Schonell Silent Reading Tests A and B, chosen because of 
their excellent reputation (Vince and Cresswell, 1976) and because they had been used as 
validity criteria to establish the credentials of later tests. Pupils aged 7.6 to 9.5 years 
were assessed by Test A, those aged 9.6 to 11.5 years by Test B. 


The administration of the questionnaire (see Appendix) was undertaken by the author, 
who read the definitions of all the fiction categories twice before the subjects were asked 
three times to pick their first choice from the list diminished by their previous choices. No 
ratings or rankings were assigned to the three votes in view of the number of variables 
already built into the investigation. A similar procedure was then adopted with the informa- 
tion list. The school library resources of 2,017 works of fiction and 2,776 factual books 
were classified into the 50 categories of the questionnaire. A 5 per cent random sample, 
stratified by category and placement, was measured by a Fry Readability Graph rescaled to 
provide readability levels in years and fractions of a year (Grant, 1976). 
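
The sampling step described above can be sketched in code. This is only an illustration under stated assumptions: the catalogue, its category names and sizes, and the reading of "placement" as a shelf location are hypothetical, and the rule of taking at least one book per stratum is a choice made for the sketch, not something the study reports.

```python
import random
from collections import defaultdict


def stratified_sample(books, fraction=0.05, seed=0):
    """Draw `fraction` of the books at random within each
    (category, placement) stratum, taking at least one per stratum."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    strata = defaultdict(list)
    for book in books:
        strata[(book["category"], book["placement"])].append(book)

    sample = []
    for key in sorted(strata):
        members = strata[key]
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical catalogue (NOT the school's holdings).
catalogue = []
for category, placement, count in [
    ("GHOST", "class library", 40),
    ("GHOST", "central library", 60),
    ("FOOTBALL", "central library", 100),
    ("COOKERY", "class library", 20),
]:
    for i in range(count):
        catalogue.append(
            {"title": f"{category}-{i}", "category": category, "placement": placement}
        )

sample = stratified_sample(catalogue)
print(len(sample), "books drawn from", len(catalogue))
```

Sampling within each stratum keeps small categories represented in proportion, which a simple random draw from the whole library would not guarantee.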


The questionnaire on reading interests had been refined by pilot work using the popula- 
tion of three matching classes in a neighbouring middle school. It seemed that this work 
might help to establish the credibility of the questionnaire as it would not be possible to 
present figures on the reliability of the grouping of the books. Important changes in 
research design resulted. These included a structured attempt to create investigator/subject 
rapport and a precise definition of all fiction and information categories to avoid confusion 
and overlap. 


RESULTS AND DISCUSSION 


The reading ages of the children and the readability levels of the library books formed 
the data first examined. The values analysed were for the frequency with which a child or 
a book respectively fell within a specific yearly reading age range. A comparison between the 
two distributions was made using the reading age of the child as the expected frequency of 
distribution whilst the readability level of the book provided the observed frequency. By so 
doing, it became clear that the supply of books was largely incompatible with the needs of 
the children (chi square >26, df = 8, P<0.01). The majority of the school population had 
reading ages between 8 and 11 years whereas most books were of readability levels above 

years. 
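
A comparison of this kind can be sketched as a chi-square goodness-of-fit calculation, with the children's distribution supplying the expected frequencies and the books' distribution the observed ones. The counts below are hypothetical and are not the study's data; nine yearly bands are assumed because they give df = 8, and the sample of 240 books is an assumption based on 5 per cent of 4,793.

```python
def chi_square_gof(observed, reference):
    """Chi-square goodness-of-fit statistic and its degrees of freedom.

    `reference` counts are rescaled to the observed total so that the
    two distributions are compared on the same footing.
    """
    if len(observed) != len(reference):
        raise ValueError("both distributions need the same bands")
    scale = sum(observed) / sum(reference)
    expected = [count * scale for count in reference]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1


# Hypothetical counts per yearly band (NOT the study's data).
children_by_reading_age = [5, 20, 55, 70, 55, 30, 13, 6, 4]   # 258 pupils
books_by_readability = [3, 8, 20, 35, 45, 50, 40, 25, 14]     # 240 books

CRITICAL_CHI2_DF8_P01 = 20.09  # tabled chi-square value for df = 8, P = 0.01

statistic, df = chi_square_gof(books_by_readability, children_by_reading_age)
print(f"chi square = {statistic:.1f}, df = {df}")
print("exceeds the 0.01 critical value:", statistic > CRITICAL_CHI2_DF8_P01)
```

With df = 8 any statistic above 20.09 is significant at the 0.01 level, so the reported value of more than 26 clears that threshold.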

The questionnaire results established ghost stories as the subjects’ fiction favourite. 
They also confirmed previous findings that adventure, mystery and animal stories are popular 
with children (Carsley, 1957; Ashley, 1970). However, these were only three among a 
larger number of preferences. The figures indicated a wide spread of interest over the 50 
categories of fiction and information, as the pilot predicted. Votes on the factual material 
followed sexual predispositions to make football and cookery the most sought-after topics. 


The data for children’s choices were compared to the provision of fiction and non-fiction 
books. In the case of fiction, over twice as many books were supplied for the least popular 
types of stories as for the favourite genres. However, a Spearman rank order correlation of 
-0.27 does not reach significance (t = 1.32, P = 0.10). A similar process was adopted for 
information material in the school library. In this instance, the figures revealed a well- 
defined imbalance. There exists a significant negative correlation between choice and pro- 
vision (Spearman rank order correlation, -0.65, t = 4.19, P<0.01). 
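
The t values quoted with the two rank correlations can be checked with the usual conversion t = r_s * sqrt((n - 2) / (1 - r_s^2)), with n - 2 degrees of freedom. The sketch below assumes that n is the number of categories ranked, taken here as 24 for fiction and 26 for information from the lists in the Appendix; the text does not state n directly. The function name is illustrative.

```python
import math


def t_from_rank_correlation(r_s, n):
    """t statistic for testing a rank correlation against zero (df = n - 2)."""
    return r_s * math.sqrt((n - 2) / (1.0 - r_s ** 2))


# n values are assumptions: category counts read from the Appendix lists.
t_fiction = t_from_rank_correlation(-0.27, 24)
t_information = t_from_rank_correlation(-0.65, 26)

# The sign follows the correlation; the size is what is compared with tables.
print(round(abs(t_fiction), 2), round(abs(t_information), 2))  # prints: 1.32 4.19
```

Under these assumptions the formula gives the same magnitudes as the text, 1.32 and 4.19.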

Generally the results seem to show that the school’s matching of literature to students 
had been ineffective to the extent that most of the books were beyond the reading competence 
of the majority of the pupils. In the area of subject matter, library provision tended to be 
inversely associated with the degree of interest shown by the children. Whilst the smallness 
of both the book and pupil samples suggests that these findings should be looked at with 
some caution, there is no evidence to indicate that this school is in any way atypical. The 
results appear to support earlier evidence that teachers are individually poor judges of the 
suitability of literature for their pupils (Lunzer and Gardner, 1979). They also suggest 
inadequate provision for the diverse interests of junior school children. If the situation 
discovered by the Whitehead study, Children and Their Books (1977), is to be improved, it 
may well be that efforts must be made to create new kinds of literature, meaningful to 
children, at levels they can assimilate. 
APPENDIX 
INSTITUTE OF EDUCATION, UNIVERSITY OF LONDON 
SURVEY OF SCHOOL LIBRARY BOOKS QUESTIONNAIRE CONFIDENTIAL 
Code No. Boy/Girl. Birthday: Age: Class: 
TYPES OF FICTION BOOKS 


1. Underline below the three kinds of fiction books you like to read most. 


ANIMAL PEOPLE FANTASY/MAGICAL FAMILY POETRY WAR 
HORSE LOVE/ROMANCE ADVENTURE BIBLE GANG 
ANIMAL MYTHS/LEGENDS HUMOROUS SCHOOL SPORT 
HISTORICAL PICTURE BOOKS MYSTERY BALLET PLAYS 
FAIRY/FOLK SCIENCE FICTION WESTERN GHOST 


2. If your favourite kind of fiction is not in the list, please write it here: 


TOPICS OF INFORMATION BOOKS 

3. Underline below the three kinds of information or reference books that interest you most. 
PETS SPORT BALLET SPACE/ASTRONOMY OUR COUNTRY 
BIRDS CRAFTS FOOTBALL FAMOUS PEOPLE FASHION/COSTUME 
ANIMALS HOBBIES TRANSPORT ENCYCLOPAEDIAS ARTS/MUSIC 
NATURE BATTLES COLLECTING PUZZLES/QUIZZES OTHER PEOPLES 
DINOSAURS SCIENCE COOKERY HUMAN BIOLOGY THE EARTH 
HISTORY 


4. If your favourite kind of information or reference book is not in the list, please write it here: 


Thank you for your help. 
SUB-CULTURAL DIFFERENCES ON SELECTED COGNITIVE TASKS 


By G. N. MOLLOY 
(Monash University, Victoria, Australia) 


SUMMARY. The present study examined some relationships between age, socio-economic 
status (SES), and cognitive task performance among 120 children from Grades 1 and 4. A 
battery of tasks differing in transformational requirements and ostensibly in cultural loading 
was administered to all children. Analyses of results indicate that low SES children were more 
handicapped in some tasks of so-called reasoning ability when compared to middle SES children. 
But contrary to expectations, the test scores of the contrasted SES groups were more disparate 
in Grade 1 than in Grade 4. For the older age group, SES differences on the culturally loaded 
Peabody Picture Vocabulary Test were apparent, whereas differences on the culturally- 
reduced Raven's Coloured Progressive Matrices were statistically non-significant. 


INTRODUCTION 


In a recent review of 50 years of intelligence testing, Vernon (1979) concluded a section 
on the alleged cultural bias of standardised ability measures by stating, “... it is just not 
true that the tests are unfairly loaded against particular sub-groups reared within western 
societies ” (p. 5). This conclusion was based on a series of studies conducted by Jensen 
(1974) in which the Peabody Picture Vocabulary Test (PPVT) and the Raven Matrices were 
given to 668 white and 381 black children in Californian schools. On both tests the whites 
obtained higher mean scores and there were no other aspects (e.g., reliabilities and factorial 
content of tests and items) of the children’s responses to indicate any between-group 
differences. In other words, there were no items that were relatively more difficult for either 
group. 

What Vernon omitted to report was Jensen’s (1974) performance comparisons with 
Mexican-American children. When Mexican-American children were compared with their 
black peers they performed at a higher level on Raven’s Matrices and at a lower mean level 
on the PPVT. But when matched on Raven’s Matrices and compared with white American 
children, the Mexican-Americans performed significantly worse than whites on PPVT. In 
commenting on this finding, Jensen (1974) remarked: “ This indicates that the Mexican 
group is somewhat handicapped on the culture-loaded PPVT relative to the culture-reduced 
Raven, but the Negro group is not ” (p. 239). Thus it would seem that the question of 
cultural bias in standardised ability measures remains unresolved. 


The purpose of this study is to compare contrasted socio-economic groups across age 
on two ability measures (Raven's Matrices and PPVT) presumed to vary in cultural loading 
in a community in which cultural differences are assumed to be less extreme than Jensen's 
Californian samples. Also of interest are group performance comparisons on tasks varying 
in degree of conceptual elaboration or processing complexity. 


METHOD 
Children 


One hundred and twenty boys attending public schools in Edmonton, Alberta, were 
selected from Grades 1 and 4. Samples for each grade were divided into high- and 
low-SES groups (N = 30). In Grade 4, the two groups of children had comparable IQs 
(Lorge-Thorndike) and all had IQs below 100. The corresponding Grade 1 groups were 
comparable on a standardised test (Metropolitan Readiness) which is a variant of intelligence 
tests used by the public schools. None of the children's scores exceeded the fiftieth percentile. 
Thus the scores of both SES groups were comparable within each grade on these two 
measures of reasoning ability. 


Tasks 

A battery of tasks including Raven's Coloured Progressive Matrices and Peabody Picture 
Vocabulary Test (PPVT) was administered to all the children. The other tests included 
Figure Copying, Memory-for-Designs, Cross-Modal-Coding, Visual Short-Term-Memory 
(digits and objects), Colour Reading and Naming, Serial and Free Recall, and Digit Span 
(forward and backward). These measures and their processing characteristics are described 
elsewhere (see Das et al., 1979; Molloy and Das, 1979). With the exception of Raven’s 
Matrices and Figure Copying, all tests were given individually. 


RESULTS AND DISCUSSION 


The mean and standard deviation values for all tasks are given in Tables 1 and 2 for the 
contrasted SES samples in Grades 1 and 4. All common variables for the two age and SES 
levels were analysed separately by two-way (Age x SES) analyses of variance. The results 
of these analyses are presented in Table 3. As expected, on all performance measures, 
Grade 4 children performed at a superior level when compared with their Grade 1 counter- 
parts. Within grades, however, certain SES performance discrepancies are apparent. 

The analysis of variance of Raven's Matrices indicates that the average performance of 
high SES groups is superior to that of low SES groups at both age levels. When the results 
are analysed separately for each grade (t-tests), the mean SES difference is more marked at 
Grade 1. The SES difference between Grade 4 groups was not significant (P > 0.05). In the 
present analysis then, the F-ratio (SES) acquires significance from the more pronounced 


TABLE 1 


MEAN AND STANDARD DEVIATION VALUES FOR GRADE 1 EXPERIMENTAL 
TASKS: LOW AND HIGH SES RESPECTIVELY 


                               Low                High 
Variable                  Mean      SD       Mean      SD 
Raven's Matrices         13.77    2.01      15.37    3.21 
Figure Copying            5.47    1.45       5.80    1.21 
Memory for Designs       16.50    6.09      12.67    5.82 
Cross-Modal-Coding        7.70    2.61       6.63    3.31 
Visual STM (Digits)      18.43    5.85      21.57    9.56 
Visual STM (Objects)     14.43    5.40      15.47    5.70 
Colour Naming            60.60   22.75      59.27   21.47 
Colour Reading           94.23   50.41      70.90   25.10 
Serial Recall            58.57   25.51      60.47   27.31 
Free Recall              76.17   13.86      76.93    4.10 
PPVT IQ                 105.43   12.81     110.10   11.05 
Digit Span (Forward)      4.10    0.71       4.17    0.75 
Digit Span (Backward)     2.63    0.76       2.50    0.82 

TABLE 2 


MEAN AND STANDARD DEVIATION VALUES FOR GRADE 4 EXPERIMENTAL 
TASKS: LOW AND HIGH SES RESPECTIVELY 


                               Low                High 
Variable                  Mean      SD       Mean      SD 
Raven's Matrices         (values illegible in source) 
Figure Copying           (values illegible in source) 
Memory for Designs        3.00    3.39       3.37    3.13 
Cross-Modal-Coding       16.47    3.12      16.80    3.42 
Visual STM (Digits)      47.83   11.98      51.87   11.27 
Visual STM (Objects)     23.47    5.33      25.50    6.06 
Colour Naming            36.40    7.03      36.37    6.39 
Colour Reading           24.37    3.13      23.53    3.95 
Serial Recall            84.73    9.84      86.30   10.87 
Free Recall              90.40    6.38      90.43    6.87 
PPVT IQ                 106.20    9.35     113.30    8.10 
Digit Span (Forward)      5.23    0.86       5.57    0.82 
Digit Span (Backward)     3.37    0.61       3.40    0.62 

TABLE 3 


SUMMARY OF TWO-WAY ANALYSES OF VARIANCE FOR GRADES 1 
AND 4 (AGE) AND HIGH AND LOW SES 


                                     F-Ratio 
Variable                    Age         SES      Age x SES 
Raven’s Matrices         292.22**      4.71*        † 
Figure Copying           184.87**       †           † 
Memory for Designs       168.93**      3.90*       5.73* 
Cross-Modal-Coding       274.64**       †           † 
Visual STM (Digits)      269.10**      3.89*        † 
Visual STM (Objects)      86.42**       †           † 
Colour Naming             63.00**       †           † 
Colour Reading           129.06**      5.48**      4.75* 
Serial Recall             50.33**       †           † 
Free Recall               45.70**       †           † 
PPVT IQ                     †          9.40**       † 
SES Rating                  †        536.70**       † 
Digit Span (Forward)      77.98**       †           † 
Digit Span (Backward)     39.58**       †           † 


*P < 0.05; **P < 0.01; † non-significant values omitted. 


disparity at the Grade 1 level. These observations concur with Schmidt’s (1966) findings 
indicating that performance on Matrices is influenced more by schooling than by socio- 
economic factors. More recently, Vernon (1973) compared samples of high and low SES 
Albertan children (N = 198) on a battery of tasks including Matrices and found no class 
difference on this measure. 


For the Memory for Designs task both the SES main effect and the Age x SES interaction 
are significant. At Grade 1, the high SES group made fewer errors in performing the task 
and is clearly superior to the low SES group. The significant interaction 
reflects a ‘ smoothing effect ’ in performance between Grades 1 and 4. At Grade 4, the SES 
difference is no longer apparent. In fact, the difference slightly favours the low SES group. 
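
The within-grade comparisons for Memory for Designs can be approximated from the summary figures in Tables 1 and 2 alone. The sketch below assumes a pooled-variance t-test with 30 boys per group and reads the table entries as error scores with two decimal places (Grade 1: 16.50 and 12.67, SDs 6.09 and 5.82; Grade 4: 3.00 and 3.37, SDs 3.39 and 3.13). The paper does not say which form of t-test was used, so this is an illustration and not a reconstruction of the author's analysis.

```python
import math


def pooled_t(mean_low, sd_low, n_low, mean_high, sd_high, n_high):
    """Pooled-variance t for two independent groups, from summary figures.

    Returns (t, df); t is positive when the high SES mean is larger.
    """
    df = n_low + n_high - 2
    pooled_var = ((n_low - 1) * sd_low ** 2 + (n_high - 1) * sd_high ** 2) / df
    se = math.sqrt(pooled_var * (1.0 / n_low + 1.0 / n_high))
    return (mean_high - mean_low) / se, df


# Memory for Designs error scores as read from Tables 1 and 2 (assumed decimals).
t_grade1, df = pooled_t(16.50, 6.09, 30, 12.67, 5.82, 30)
t_grade4, _ = pooled_t(3.00, 3.39, 30, 3.37, 3.13, 30)

CRITICAL_T_DF58_P05 = 2.00  # approximate two-tailed tabled value for df = 58

print(f"Grade 1: t = {t_grade1:.2f}, df = {df}")
print(f"Grade 4: t = {t_grade4:.2f}, df = {df}")
print("Grade 1 significant:", abs(t_grade1) > CRITICAL_T_DF58_P05)
print("Grade 4 significant:", abs(t_grade4) > CRITICAL_T_DF58_P05)
```

On these figures the Grade 1 difference exceeds the 0.05 criterion while the Grade 4 difference does not, which is the pattern the interaction term describes.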


Memory for Designs and Matrices test scores were significantly correlated for both age 
levels, indicating that these tasks sample analogous abilities. The present data indicate that 
the difference between the performance of high and low SES groups on Matrices and Memory 
for Designs decreases with age. One way of accounting for this increasing congruence is in 
terms of the ‘ levelling effect ’ of uniform classroom instruction. Schmidt (1966) provides 
supportive evidence for the contention that schooling has a levelling effect on cognitive task 
performance. He compared children varying in school entrance age on Matrices and found 
that performance on this task was more influenced by the length of schooling than either 
socio-economic circumstances or chronological age per se. The present results imply that 
schooling rather than SES is an important environmental factor. 


High SES boys recalled more numbers in their correct serial position in the Visual 
Short-Term-Memory (digits) task than low SES boys. The non-significant Age x SES 
interaction indicates that the relative performance of the two SES groups does not alter as a 
function of age or schooling. When the test stimuli were pictures of common objects rather 
than digits, no differences were apparent. 


The Colour Reading task is at first blush a measure of response speed, since the children 
were instructed to complete this task as quickly as possible. In the analysis, the SES main 
effect is significant. More importantly, the analysis reveals a significant Age x SES inter- 
action. While both high SES groups are characterised by a shorter response latency, the 
SES effect acquires significance from the interaction. At the Grade 1 level, low SES children 
are markedly slower in comparison to the high SES group. At Grade 4, the SES difference 
is negligible. It is obvious that at Grade 1 this task is measuring reading ability in addition 
to response latency. In the case of colour identification (Colour Naming) no SES differences 
were apparent. 
At both Grade 1 and Grade 4 the mean performance on PPVT of high SES children is 
superior to that of the low SES groups. The Age x SES interaction was not significant. If 
one regards PPVT as culturally biased—in the sense that successful performance is largely 
dependent on the richness of the verbal environment—the present results lend support to the 
cultural or linguistic deprivation position (e.g., Lawton, 1968). 

Considering both age levels collectively, the high SES performed better on the reasoning 
tasks than the low SES groups. Thus, the data are ostensibly consistent with Jensen's (1969) 
hypotheses. His model would predict that SES differences will be greater on Level II-type 
(reasoning) tasks, whereas Level I (memory) performance differences will be non-existent or 
insignificant. The results appear to be congruent with this position. Except for Grade 1, 
however, the evidence is hardly compelling. First, when Grades 1 and 4 are considered 
separately, only the Grade 1 results accord with the hypothesis that high SES children will 
be more proficient on Level II-type tasks. This was not the case for Grade 4 since, with the 
exception of the culturally loaded PPVT, SES differences on Level II measures are not 
apparent. Secondly, the hypothesis that high SES children with IQ less than 100 do less well 
on Level I tasks than low SES children in the same IQ range is disconfirmed by the current 
data. 


To return to the issue foreshadowed in the introduction, the present results lend support 
to the notion that certain standardised ability measures—PPVT in this case—are loaded 
against particular sub-groups in an English-speaking community. Indeed, the present data 
concur with the results of an earlier investigation of SES bias with children from the same 
cultural milieu (MacArthur and Elley, 1963). In that study, lower SES children were 
relatively less handicapped on ‘ culture-reduced ’ tests (e.g., Raven's Matrices) than on more 
conventional verbal tests. MacArthur and Elley concluded that the Raven’s Matrices 
showed a high g loading, minimal relation with SES, no evidence of cultural bias by items 
and moderate correlations with academic performance measures. The results of the present 
study suggest that the PPVT is not unbiased with respect to SES sub-groups. Whether or 
not culturally-biased tests are better predictors of academic performance in a given culture 
is a quite different question. 
Summary. Anthony's theory (1973) of the development of extraversion in children requires 
that there should be a negative correlation between extraversion and intelligence in adolescents 
over the age of 13 or 14. To test this, Raven's Progressive Matrices and the Junior Eysenck 
Personality Inventory were administered to a large group (802) of 15- and 16-year-olds. There 
was a significant positive correlation between extraversion and the Matrices, so Anthony's 
theory was not supported. 


INTRODUCTION 


There is consistent evidence that for young children, up to about 11 years, there is a 
small positive correlation between extraversion (E) and academic attainment (Rushton, 1966; 
Savage, 1966), while for university students the correlation is negative (Savage, 1962; Kline, 
1966). This is readily understandable in terms of the different learning situations involved: 
in the young child, learning (and perhaps also the testing) is likely to be in a social situation, 
which would favour the extravert, while private study is the most important form of learning 
for the university student. In a recent article, Maqsud (1980) found a negative relationship 
between extraversion and attainment in 13-year-old Nigerian schoolboys; he attributed this 
to the style of teaching, which stressed ‘ serious academic concentration ’. 


More interesting is the relationship between E and scores on intelligence tests, and the 
theory generated about this by Anthony (1973). Mean E scores in children rise steadily with 
increasing age, reaching a peak at 13 or 14, then falling slightly; at the same age there is a 
tendency for the correlation between E and IQ to change sign from positive to negative 
(Eysenck, 1965). Taking this in conjunction with the fact that there is a steady decline in E 
score in adults from 16 to 69 (Eysenck and Eysenck, 1975), Anthony suggested that the cor- 
relations could be due to different rates of development of extraversion; if the average peak 
for E is about 14, the more rapid developers (who would also be brighter) would reach their 
peak earlier than this, and would have declined slightly by the time the less intelligent were 
reaching their peak some time later than 14. Hence there would be a positive correlation 
between E and IQ before 14, when the brighter were more advanced in extraversion, but this 
would reverse in sign after 14, when the slower children were still becoming more extraverted 
and the quicker were already moving down from their peak. 


Whatever their sign, the correlations obtained have all been small, and they are unlikely 
to have any predictive value. The interest of Anthony's formulation lies in its implications 
about the nature of extraversion, if this IQ-linked developmental course is confirmed. 
Unlike other aspects of physical and mental growth, in which those who are more advanced 
tend to develop for a longer period and decline less rapidly, thus maintaining an advantage 
throughout life, the unusual development postulated for extraversion in Anthony's theory 
would suggest a characteristic which is an advantage during childhood but becomes a dis- 
advantage after puberty. This would suggest interesting socio-biological possibilities and it 
is worthwhile to attempt to confirm it. 
By averaging figures given by Eysenck (1965), Anthony produced a set of E/IQ correlations 
for the years 11 to 16, which fitted the theory extremely well, starting at +0.24 at 11, passing 
through zero at 14 to -0.14 at 16, all moving in the right direction. However, the numbers 
of children, especially in the later years, were very small, the last figure being the average of 
four correlations each based on eight or nine children! Clearly in a problem like this, where 
we are concerned with the differences between small correlations, we can only make use of 
correlations with very small standard errors, i.e., derived from large groups. In a later 
article, Anthony (1977) used the data of 266 children who had been tested (E and IQ) at the 
age of 10-11, and again at 15-16. The correlation was +0.142 at the earlier and -0.088 at 
the later age. The latter is not significantly different from zero, but Anthony was able to 
show that IQ had a significant positive correlation (+0.175) with a relative fall in E score 
from the first to the second occasion. This gave support to the theory, but the fact that 
different tests of both intelligence and extraversion were used on the two occasions causes 
some difficulty for combining the results in this way, and it would be more satisfactory if 
there were a significant negative correlation at the greater age. The children in this group 
were all of above average intelligence, which would tend to reduce any correlation that 
existed over the whole range. 
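
The point about standard errors can be made concrete with Fisher's z transformation, under which a correlation from n cases has a standard error of about 1 / sqrt(n - 3) on the z scale. The sketch below is illustrative only: it treats -0.14 as if it came from a single group of nine children, which simplifies the averaged figure described above, and the 1.96 multiplier assumes a 95 per cent interval.

```python
import math


def fisher_interval(r, n, z_crit=1.96):
    """Approximate confidence interval for a correlation via Fisher's z.

    Returns (lower, upper, standard error on the z scale).
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se), se


# A correlation of -0.14 from nine children (simplifying assumption).
low_small, high_small, se_small = fisher_interval(-0.14, 9)

# The later-age correlation of -0.088 from 266 children.
low_large, high_large, se_large = fisher_interval(-0.088, 266)

print(f"n = 9:   SE = {se_small:.3f}, interval {low_small:+.2f} to {high_small:+.2f}")
print(f"n = 266: SE = {se_large:.3f}, interval {low_large:+.2f} to {high_large:+.2f}")
```

With nine cases the interval spans most of the possible range, and even with 266 cases the interval around -0.088 still includes zero, in line with the statement that this value is not significantly different from zero.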


Thus the weakness in the empirical data produced to support Anthony's thesis concerns 
the older age groups. The positive correlation at the lower ages has been demonstrated 
conclusively, using very large numbers, by Eysenck and Cookson (1969) and Entwistle and 
Cunningham (1968). We considered it worthwhile, therefore, to look at the relationship 
between IQ and E at the critical age of 15-16, using a large group of children covering a wide 
IQ range. We decided to use Raven's Matrices as the IQ measure, as it is the most widely 
used British group intelligence test. Some of the studies referred to have used verbal tests, 
but in the Entwistle and Cunningham study both verbal and non-verbal tests were used, 
and they had almost identical correlations with E. 


METHOD AND RESULTS 

During the course of a larger investigation Raven's Progressive Matrices (Raven, 1947) 
and the Junior Eysenck Personality Inventory (Eysenck, 1965) were administered to 802 
(314 male, 488 female) fifth-year pupils, age 15-16, in 21 secondary schools in Essex, 
Nottinghamshire and Yorkshire. They were all administered by the same researcher, who 
followed a standardised procedure. Results were analysed using the SPSS computer package 
(Nie et al., 1975). Correlations between the Matrices and the personality variables are shown 
in Table 1. 


TABLE 1 
CORRELATIONS (r) OF MATRICES WITH JEPI SCORES 


                 Correlation between Matrices and: 
Group      N     Extraversion   Neuroticism   Lie Scale 
M        314        +0.093        -0.109        -0.145 
F        488        +0.106        -0.033        -0.159 
Total    802        +0.102        -0.074        -0.155 


The correlations with E are positive and the same for boys and girls. For the total 
group the correlation is significant at the 0.01 level. The relationship was examined for 
linearity of regression by the analysis of variance method; there was no suggestion of non- 
linearity (F = 0.39; df = 4, 796; NS). The correlations with neuroticism and the lie scale 
are included in the table to show that they are similar to those obtained with 10-11-year-old 
children by Eysenck and Cookson (1969). 
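
The statement that the correlations with E are the same for boys and girls can be checked with the standard test for two independent correlations: the difference between the Fisher z values divided by sqrt(1/(n1 - 3) + 1/(n2 - 3)). The sketch reads the Table 1 entries as +0.093 (N = 314) and +0.106 (N = 488); the function name is illustrative and the test is not one the authors report having run.

```python
import math


def compare_independent_correlations(r1, n1, r2, n2):
    """Normal deviate for the difference between two independent correlations."""
    difference = math.atanh(r1) - math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return difference / se


# Extraversion correlations for boys and girls as read from Table 1.
z_sex_difference = compare_independent_correlations(0.093, 314, 0.106, 488)
print(f"z = {z_sex_difference:.2f}")  # a value well inside +/-1.96
```

A deviate this close to zero gives no indication that the boys' and girls' correlations differ.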


DISCUSSION 
The finding of a small but significant positive correlation between E and IQ in a large 
group of children aged 15-16 is inconsistent with Anthony's interesting theory of the 
development of extraversion, at least in the precise form in which it was expressed. Most of 
the evidence now suggests a small positive relationship in children at all ages. This can be 
explained, without recourse to any special theory, as due to the effects on intelligence tests 
of the anxiety and inhibition associated with the more extreme degrees of introversion; 
in the Junior Eysenck Personality Inventory there is a substantial negative correlation 
between E and neuroticism (Eysenck, 1965). There is not much information about the 
situation with adults. Ley et al. (1966) found a significant positive correlation between E 
and Cattell’s B Factor in a group of 144 adults, but this can be explained as a straightforward 
age effect; the group had a mean age of 46 years, with a large range (standard deviation 13 
years). The score on Factor B would decline over this period, as would the E score, and this 
could account for the positive correlation. 


The change in mean E score with age, gradually rising in childhood and then declining 
in adult life, could reflect a changing self image as the child is brought increasingly into social 
situations, and the adult in later life gradually retires from them. 
BLANCHARD, J., Comberton Village College. 


LEARNING IN THE SECONDARY CLASSROOM AND 
SCHOOL—INITIATION, DIRECTION, AND EVALUATION 


The focus in this paper was the transactions between teacher and learner, indicative of 
how each is involved in the moments or operations of initiation, direction and evaluation of 
learning. Past and current definitions and exemplifications of fruitful learning (see, among 
others, John Dewey, Homer Lane, George Kelly and Carl Rogers) are accepted: that 
fruitfulness in the process and outcome of learning is chiefly determined by the learner's 
identification with the purpose, realisation and assessment of learning. 


In the classroom 


A process was described whereby pupils initiate, direct and evaluate their work. This 
was set, for comparative purposes, against a mode of teaching whereby the teacher controlled 
those moments or operations in the pupils’ work. Video-tape was shown to illustrate the 
different emphases and styles of these alternative approaches to the organisation of learning. 
Whereas in other commentaries the actual relationship between teacher and learner has 
received priority attention and concentrated analysis, this commentary revealed incidentally 
the relationship between teacher and learner through specific study of their respective decision- 
making and control in critical moments and operations of learning. It was concluded that 
for the short-term aims of teaching within the classroom it was not altogether necessary for 
the learner to initiate, direct or evaluate learning in order to identify with its purpose, 
realisation and assessment: not all fruitful learning needed to be autonomous learning. 


In the subject department of a school 


A process whereby pupils are formally invited to contribute to the evaluation of their 
learning was described. This was set, for comparative purposes, against the conventional 
mode of education whereby pupils are formally excluded from the evaluation of their work. 
The consequences of the alternative approaches to evaluation were explored in relation to 
the initiation and direction of pupils’ learning. It was concluded that for the long-term aims 
of education within the school it was vitally necessary for the learner to evaluate learning in 
order to identify with its purpose, realisation and assessment: all fruitful learning which has 
relevance to the life, well-being and enhancement of the individual and society needs to be 
autonomous learning. 
THE COMPREHENSIVE SCHOOL CURRICULUM FOR 14-16 
YEAR OLDS AND ASSESSMENT AT 16 


Aspects of Secondary Education in England, a survey by HMI, was carried out between 
1975 and 1978. Teams of HMI visited 384 maintained schools and concentrated their 
attention on the education of fourth and fifth year pupils. They found that the public 
examination system dominates rather than serves the curriculum; that pupils are taking 
inappropriate courses; that ‘ too much reliance is placed on examination results ’ and that 
‘ most pupils need stimulus and opportunity to take a more active part in their own education ’. 
The present public examination system of GCE O-level, intended for the top 20 per cent 
of pupils, and the CSE, intended for the next 40 per cent, leaves up to 40 per cent of our 
pupils with little or nothing to show for their years at school. The effects of the present 
system are divisive and damaging, especially to the average and below average children. 
The final year of compulsory education is so dominated by the examination system that little 
real education can continue after the first term. 


Now, the recommendations of the Waddell Report having been rejected by the Govern- 
ment and proposals for a dubious common system of examining at 16 being merely at the 
drawing board stage, the time for a totally new system is opportune. The Secretary of State 
for Education and Science has suggested that we look at the idea of a pupil profile being the 
culmination of secondary education for those who do not take public examinations. The 
Rutter report, Fifteen Thousand Hours, stated: ‘There are likely to be problems in a system which is
geared to success in exams which are set at such a level that two-fifths of the school population
are expected to fail’. Two-fifths receiving a profile will have the same effect.


It is essential to move to a position in which monitoring becomes part of the teaching 
process and is not an end in itself. It is equally imperative to move towards more pupil 
involvement in planning and organising their own learning experiences. How this might be 
achieved is outlined in Burgess and Adams (1980). They make proposals for validating pupil 
programmes and for giving national currency to School Statements for 16-year-olds. My 
concern here is to impress on psychologists the urgency for secondary schools to plan their 
own programmes and to assess outcomes other than through public examinations. 


BEHAVIOURAL CONSEQUENCES OF PUPILS'
CLASSROOM ATTRIBUTIONS 


Various reasons have been proposed for the difficulties experienced in replicating Rosen-
thal and Jacobson's self-fulfilling prophecy. One possibility arises from the Mexican-
American flavour of their sample. Bar-Tal (1978) has noted that social and ethnic groups
with little access to significant power will manifest more externality and Phares (1978) that
in achievement situations externals attend more to social cues than to task-generated ones.
Accordingly Weiner's (1979) view that ‘ future expectations of success and failure [will] be 
based upon one's perceived level of ability in relation to the perceived difficulty of the task . . . 
as well as an estimation of the intended effort and anticipated luck’ was investigated in 
relation to the source of the perception of ability. 


The experiment was in three phases. A test consisting of all possible paired comparisons
of eight statements reflecting Weiner's (1979) major attributions revealed zero correlation 
between success and failure attributions and concurrent validity with Crandall’s (1965) scale 
among third-year pupils in a middle-class school. In the second phase, sixth formers 
nominated O-level subjects as successes and failures. For internals, success subject grades
were, as hypothesised, more closely related to tested intelligence than to teachers’ expectations
of O-level results and vice versa for externals, thus establishing the construct validity of the 
test. In the final phase fifth formers nominated expected success and failure subjects at O- 
level and the same hypothesis was entertained. 


CALDERHEAD, J., Ph.D. University of Lancaster.


RESEARCH INTO TEACHERS' PERCEPTIONS OF THEIR 
PUPILS: SOME CONCEPTUAL AND METHODOLOGICAL ISSUES 


Past research into teachers' perceptions of their pupils has frequently been guided by 
models of person perception or symbolic interactionism and has been stimulated by pragmatic 
concerns with identifying the skills of teaching or the determinants of pupil progress (see, for 
example, the work of Brophy and Good, 1974; Nash, 1973; Solomon and Kendall, 1977). 
However, it is here argued that the nature of teachers' perceptions has often been narrowly 
conceptualised and, together with the limitations of the methods and forms of data analysis 
adopted, has resulted in only one aspect of these perceptions being explored: namely those 
attributions relating to the broad generalisations which teachers make amongst their pupils. 
Recent research on teachers' classroom decision-making (McKay and Marland, 1978;
Calderhead, 1979), adopting stimulated recall or teacher commentary procedures, suggests 
that teachers' perceptions of pupils may in fact be much more complex, many of the teachers' 
attributions being very specific and relating to individual pupils in the class. Given such 
complex perceptions, several questions can be posed which are of interest to both teachers 
and researchers. How are teachers' attributions formed? Are general attributions ab- 
stracted from more specific ones? How are those attributions of use to teachers? Are 
teachers’ attributions internally consistent and valid? It is suggested that further research 
requires to be based upon more appropriate models and methodologies than hitherto, and 
that it is in such a context that the value of attribution theory may be considered. 


A SOCIAL PSYCHOLOGICAL PERSPECTIVE ON
EDUCATIONAL TECHNOLOGY 


An increasing number of authorities believe that educational technology has failed to 
live up to its promise. These include the respected Carnegie Commission on Higher Educa- 
tion in the USA, and Clifton Chadwick (1979). Educational technologists themselves have 
put forward explanations, including teacher resistance, union power, and incompetent 
decision making; and have suggested solutions such as greater use of the ‘systems approach’,
organisational restructuring, and implementation of a new technological model of education. 


These diagnoses and solutions have been analysed critically from a number of points of 
view (Champness and Young, 1979, p. 80). Firstly, there are a number of underlying 
assumptions which are not only incomplete but probably wrong. Secondly, the use of the 
term ‘educational’ instead of the more apposite ‘instructional’ technology may have led
educational technologists to believe that their ideas have a wider application than justified. 
Use of a particular kind of language may also have led exponents into unwarranted techno- 
logical determinism. 

A number of educational reasons for bringing social psychological thinking into dis- 
cussion about the use and impact of educational technology were put forward. These, and 
an examination of the history of educational technology (Saettler, 1978), led to the conclusion 
that a conspicuous omission exists in the theorising and thinking of educational technologists: 
the social context within which learning takes place. 

Finally, after suggesting some simpler and more straightforward explanations for this
failure, it was proposed that not until the list of disciplines shaping and influencing educa- 
tional technology includes the social sciences will promise be reflected in performance. 


REFERENCES 


CHADWICK, C. E. (1979). Why educational technology is failing (and what should be done about it).
Educ. Technol., January.

CHAMPNESS, B. G., and YOUNG, I. (1979). Social limits on educational technology. In Proceedings
of the European Conference on the Role and Value of the New Communication Techniques in
Education, Strasbourg, Sept. 1979. (Also to appear in Eur. J. Education, 1980.)

SAETTLER, P. (1978). The roots of educational technology. Programmed Learning and Educational
Technology, 15, 1, Feb.


COWAN, R., B.A., Ph.D., GERRARD, S., and KERCHER, S. Department of Child Development
and Educational Psychology, University of London Institute of Education. 


WEANING CHILDREN AWAY FROM LENGTH OR ‘THROW 
AWAY THOSE CUISENAIRE RODS’


One of the most easily replicated results of studies of children's number development 
is that young children answer questions about relative number as though they were questions
about relative length. Some possible reasons for this misplaced faith in length as a guide 
to number are discussed. 


A way of helping children overcome this problem in the context of number conservation
tasks appears to have been found by Bryant (1972). He found that simply exposing children
to the inconsistency of relative number judgments based on relative length resulted in dramatic
improvements in their performance of number conservation tasks. Bryant's (1974) stronger 
claim that children fail number conservation tasks solely because they do not know how to 
resolve conflicts between length-based judgments and those based on one-to-one correspon- 
dence seems suspect (Cowan, 1979a). The conflict between incompatible number judgments 
may, however, underlie the developmental sequence between identity and equivalence 
conservation (Cowan, 1979b). Further support for this sequence and some evidence in 
support of the view that what children who pass identity conservation and fail equivalence 
conservation lack is a way of resolving this conflict was gained in a study to be reported. As
children are commonly taught number conservation in infant school mathematics schemes 
(e.g., Fletcher Maths) this finding has instructional relevance. 


To see how well Bryant's conflict hypothesis stands up in another area two studies of
the effects of guidelines on children's number judgments were conducted. The results of 
these studies did not agree: in one study guidelines were found to improve children's 
judgments of both small and large number arrays, with performance on the latter closely 
fitting predictions derived from the conflict hypothesis. In the second study guidelines did
not improve performance on large number versions. Whereas the children in the first study
were attending a school in which a ‘New Maths’ scheme was taught, children in the second
study were drawn from a school which did not use such a scheme. The implications of this
demonstration of the importance of educational environment were considered.


DICKINSON, J. A., B.Sc. Dental Health Study, University of Cambridge.


APPLYING SOCIAL PSYCHOLOGY TO DENTAL HEALTH 
EDUCATION FOR PRE-SCHOOL CHILDREN: SOME 
PROBLEMS AND SOME SUCCESS 


Most dental health education for the under-fives takes various forms of media presenta- 
tion of information and advice to parents. However it is misleading to hope that knowledge
will lead to appropriate behaviour (Rosenstock, 1974). Despite the efforts of dentists, health
educationists and toothpaste promoters, there is still a high rate of tooth decay among pre-
school children (Todd, 1973). One aim of a research project to develop a more effective
dental health education (Craft et al., 1980) has been to design a teaching package or pro-
gramme for use in a variety of pre-school institutions to affect the dental health behaviour
of children directly. 


The programme draws upon theory from behavioural and educational disciplines. An 
attempt is made to alter cultural norms of dental health behaviour by integrating the pro-
gramme as naturally as possible into the routines of the nursery, playgroup or mother and
toddler club and making the staff or organisers of these institutions the change agents because 
they are already known by, and influential with, the children and their parents. The 
programme utilises the following techniques held in social psychology to be important for 
behaviour change and learning: participation in groups, and activity learning, inclusion of 
significant adults such as parents and teachers, behaviour modification and structuring the 
situation so that the child is not led by complicated language to misconstrue the required 
behaviour. 


This paper concentrated on one aspect of the programme—teaching pre-schoolers to 
choose sugar-free snacks. Evidence gathered from pilots of the pre-school programme in 
nursery schools, playgroups and mother and toddler clubs in Cambridge and Peterborough 
was cited to demonstrate that the required snack choices can be established within the short 
term of the programme in the nursery setting. Problems that prevent the programme 
achieving its aim (with most subjects) of permanent change in snack eating habits, both 
within the institution and in the community were identified and discussed and suggestions 
were made as to how they may be overcome. 


OBSERVING YOUNG CHILDREN AT HOME


The problems and possibilities presented by observing young children at home were 
discussed. It was argued that there were important questions in the study of social and
emotional development, in the development of language and communication skills and in 
the development of symbolic play which can be addressed far more successfully using home 
observations than in laboratory situations. The particular methodology employed to carry 
out such observations will depend crucially upon the particular research question being 
investigated. The methodological issues raised centre on 

(1) The effect of the observer's presence. Ways of minimising this are discussed with
examples from research: use of automatic recorders; parents as observers;
observers as ‘participants’; use of semi-structured and interview situations in the
home.

(2) The issue of whether a representative sample of child behaviour, or family interaction,
has been observed. Ways of estimating the effects of different time samples/time
of day samples are described.

(3) Sample size. Home observations involve working with relatively small samples; 
however, for some research questions this is no disadvantage (examples given). 

(4) The choice of descriptive categories; this will obviously depend on the particular 
research issues being investigated. 

(5) The difficulty of describing interaction between more than two individuals; the 
importance of taking account of this is illustrated with examples. 


AN ASSESSMENT OF ADOLESCENT COMPREHENSION OF
SOME GEOGRAPHICAL CONCEPTS, USING PEEL’S 
THEORETICAL FRAMEWORK 


The research was carried out to investigate some of the problems which secondary
school children have when interpreting maps. In particular, it was designed to test children's 
ability to perform four skills which are fundamental to the problem of map interpretation. 
These were: 


(1) the ability to orientate a two-dimensional image with a map;

(2) the ability to orientate a three-dimensional image with a map; 

(3) the facility for appreciating complex shapes on maps; 

(4) the understanding of four of the basic ‘mechanical’ map reading techniques of
finding compass direction, horizontal scale, contour reading and co-ordinates on
a map.


These abilities were tested separately and analysed in relation to maturity of thought as 
conceptualised by Peel (1971, 1972), chronological age and general intelligence. 


The sample comprised 102 children (52 boys and 50 girls) from a large mixed com- 
prehensive school in a West Midland city. The ages of the children ranged from 12 to 154 
years. All abilities within the school were represented.


Performance in all tests was found to depend more on the general intelligence factor 
than on either maturity of thought or chronological age. Age and maturity of thought only 
appeared as influential factors in tests concerning two-dimensional orientation and complexity 
of shape. Boys were found to be better than girls at tests requiring orientational skills. 


Of the four skills tested, the children appeared best at appreciating complexity of shape. 
Only the least able could not deal with this aspect of map work although there was evidence 
to suggest that the intelligence factor became less important with increasing age. The 
problem of orientating two-dimensional information with a mapped outline was the second 
easiest task for the children. The four basic map reading skills were not shown to be well 
understood, direction and co-ordinates being the better two. The task most daunting to the 
children was the problem of comparing a three-dimensional mental image with the same 
information on a map. Implications of this study are discussed for geography teaching and 
suggestions are made for further research in this field. 


NURSERY NURSES AND NURSERY TEACHERS: THEIR
ROLE IN THE PRE-SCHOOL AND HOW THEY ASSESS
CHILDREN'S VERBAL SOCIAL BEHAVIOUR


Ratings of children's verbal-social ability assigned by nursery nurses in day nurseries 
were compared with those assigned by teachers and nursery nurses/assistants in nursery 
schools and combined nursery centres. Previous studies had combined assessments of social 
behaviour made by teachers and nursery assistants and this was questioned. 


Analysis showed that the nursery nurses in day nurseries tended to assign higher verbal- 
social ratings than other groups of staff. 


The ratings were correlated with the children's language test scores: these correlations 
were higher for all nursery nurse groups than for the teachers. Similarly, correlations of the 
ratings with direct observation measures of the children's behaviour were highest in the day 
nurseries and combined nursery centres, where most of the verbal-social assessments were 
made by nursery nurses/assistants. These results indicate that the groups of staff were
assessing different attributes when making their ratings. It was suggested that while the 
nursery nurses were assessing directly observable and measurable verbal and social behaviour 
the teachers were assessing a more complex verbal reasoning skill. 


The findings were discussed in relation to the NNEB's role in the nursery, suggestions
for action research in day nurseries and a unified system of training. 


HUGHES, M. Department of Psychology, University of Edinburgh.


RECORDING ADULT/CHILD CONVERSATIONS AT HOME 
AND AT NURSERY SCHOOL 


This paper described a technique developed for recording conversations between young 
children, their mothers, and their nursery teachers. Four-year-old girls were asked to wear
a special dress fitted with a radio-microphone, which they wore at home and at nursery
school for several consecutive days. An observer was present in both places to write down
the context of the conversations. The paper described various practical and methodological 
problems which arose, and how these were tackled. An example was given of the kind of 
analysis subsequently carried out on the transcribed tapes. 


Further details of the methodology can be found in Hughes et al. (1979). 


WHAT'S SO HARD ABOUT TWO AND TWO?


The question posed in the title arises from a study in which three- to five-year-old 
children were given simple addition and subtraction problems. Most children succeeded 
when the problems referred to specific objects, even when these objects were not immediately 
present (e.g., ‘If there was one child in a sweetshop and two more children went in, how
many children would be in the shop altogether?’). However, very few children succeeded
when the same problems were presented more formally (e.g., ‘What does one and two
make?’ or ‘How many is one and two?’).


A theoretical position was put forward which emphasised that questions like ‘What's
two and two?’ belong to the formal code of arithmetic. In order to operate with this code
children must acquire translation procedures which enable them to translate from the formal 
code to their own informal knowledge and vice versa. Ways in which children can be helped 
to do this were described. 


ABOUT THE PROCESSES THROUGH WHICH
MATHEMATICS IS WON AND LOST 


Mistakes made by children doing and learning mathematics were examined. It was 
argued that these mistakes, made over a period of years, destroy confidence and create 
confused young minds. 


Methods of easing confusion and restoring confidence were discussed. 


SOCIAL ASPECTS OF COPING WITH OCCUPATIONAL
STRESS BY SCHOOL-TEACHERS 


A number of writers have emphasised the role of social support as a means of reducing
the occupational stress experienced by school teachers. This paper discussed the relationship 
between social support and the coping strategies used by teachers in the light of an exploratory 
study which investigated the relative frequency with which 42 comprehensive school teachers 
reported using different actions to cope with stress. 


A principal components analysis of the coping actions reported by the teachers in-
dicated that they appear to cope with stress in three main ways: (1) by expressing feelings and
seeking social support, (2) by taking considered actions to deal with the sources of stress, and
(3) by trying to think of other things. It was thus suggested that for social support to reduce
stress the type of social support received should match the type of coping strategy a teacher
adopts.


Hitherto, social support has been largely equated with simply having someone to talk 
to about one's problems, and for this the provision of a formalised system of social support 
in schools involving a counsellor or senior staff has been envisaged. However, it is important 
to note that there are different types of social support which serve different ends. For
example, the reassurance gained by talking to other teachers and thereby realising that they
have similar problems may be contrasted with social interaction aimed at taking one's mind
off work, such as playing bridge or chess in the lunch hour. As such, a formalised system
of social support, whilst desirable, should not allow other forms of social support to be
overlooked. 


LINDSAY, G. Sheffield Psychological Service.


A PROJECT-BASED PSYCHOLOGICAL SERVICE:
IMPLICATIONS FOR PRACTICE 


Over recent years there has been a tendency for educational psychologists to increase 
the work they do with clients other than individual children and families, e.g., school and 
LEA systems. Two aspects of this trend were considered. First, the range of developments 
possible was exemplified by reference to some of the recent work of the Sheffield Psychological 
Service. Then the implications for these developments were examined, primarily from the 
practical standpoint. Issues examined included (i) the organisation of services necessary to 
meet objectives derived from this approach; (ii) the problems of overcoming possible role 
conflicts of the educational psychologist as researcher, innovator, local government employee 
and therapist; (iii) the evaluation of project-based work.


ATTRIBUTION IN A CROSS-CULTURAL SETTING—CASE
STUDIES FROM ENGLAND AND SRI LANKA 


Research was carried out between 1976 and 1979 on children’s attributions for academic 
success and failure in one English school and two Sri Lankan schools (one urban, one rural). 
The age range under study was 5 to 14 years. Two questions were asked of the attribution 
process in each country: 


(1) Are there age-differences in the types of attributions used? 
(2) Are there age-differences in the ability to use the structure of compensatory causality 
for the integration of multiple attributions? 


The research data resulted in an extension of Weiner’s (1979) lists of attributions and a 
questioning of the usual categorisation of ‘ability’ as internal, stable and uncontrollable.
Ability attributions were themselves divided into three types—performance ability, specific
competence and general competence. Developmental trends in the different types of 
attributions were discussed. 


The commonalities between the two country samples were greater than their differences— 
but the differences included the greater use of the ‘motivational’ and ‘facilities’ categories
in Sri Lanka and the lesser consistency of the trends for the three types of ability attributions. 
These differences were related to the different social and economic contexts of the school in 
each country and to differences between the Sinhala and English languages. 

The use of the compensation structures showed clear developmental trends in both 
countries with the exception of one story administered to the Sri Lankan sample. The 
implications of this finding were discussed from the perspective of cross-cultural Piagetian
psychology. 


ASSESSING SEQUENCE AND CONTEXT IN
A FREE-PLAY SITUATION 


The attempt to understand, rather than just describe, a child’s behaviour involves the 
need to view it in its setting and to have as complete an account as possible of all factors 
which might affect it. For a nursery school child, speech must be included and this involves
close contact with a child who is moving, often rapidly, from room to room or from outdoors
to indoors. In such circumstances pencil and paper techniques have much to recommend
them for convenience and minimal disturbance. With well-defined recording categories, 
reliability can also be high. 


A recording form has been developed in which events and context can be set side by
side in a continuous record of a single child's behaviour divided into half-minute intervals.
‘Context’ includes place, companions, types of activity and social involvement. Some of
these categories are modifications of those proposed by Tizard et al. (1976) and Roper and 
Hinde (1978). 


A sequential transcript is made of events and those involving direct interactions are
classified in such a way as to indicate apparent intent and effect rather than exact form. 
They include categories such as friendly, adaptive, annoying, contrary, attention-seeking, 
boasting, demanding. In addition there is a coding of features which increase the interest 
of a communicative act, e.g., ‘Come here, I’ve got a surprise for you’ as opposed to ‘Come
here’. Although subjective, these categories can be satisfactorily specified by example and
precedent, as well as definition. 


The classified transcript of each observation (usually of 20 minutes duration) allows a 
quick assessment of the ‘flavour’ of a child's interactions with others. It also allows 
special examination of periods of ‘poor’ or ‘good’ social interactions and of the situations
and events related to them. It is possible to consider significant ‘social’ contexts such as
‘meeting with opposition’ or ‘attempting to get into a new game’ or ‘being ignored’ as
well as more obvious situational ones. 


COGNITIVE SOCIALISATION OF FOUR-YEAR-OLD CHILDREN
IN NURSERY SCHOOL 


Following a previous study (Wilkinson and Murphy, 1976) into differential methods for 
enhancing cognitive growth in disadvantaged pre-school children, this study was concerned 
with the process of cognitive socialisation in a typical progressive nursery school. A test-
observe-test design was used to investigate the relationships between nursery school experience
and changes in cognitive activity in relation to IQ, socio-economic status and sex. 


Twenty-nine four-year-old children were systematically observed for a four-month 
period in one nursery school using a Child Observation Schedule developed by the authors 
from previous work by Tizard et al. (1976). The schedule provided extensive information 
on children’s behaviour in a situation where the children were relatively free to choose how 
to spend their time. The adults were also observed on a similar basis to the children. 


The children were pre- and post-tested on a variety of cognitive measures—WPPSI, 
Reynell, EPVT and measures of operativity. At the end of the observation period, the
majority of children had improved their cognitive performance, the mean gain being 
significant for all measures. 


Of the three macro factors (IQ, SES, and sex), it was SES that was most closely related 
to the distribution of experiences children encountered in nursery school. The high SES 
group spent more time at cognitive stations and more time in verbal communications both 
with adults and other children. The low SES group spent more time alone, more time on 
social/personal activities, more passive time with adults and more time in solitary play. 


The cognitive performance of several children appeared not to improve. These ‘losers’
spent more time at physical stations, more time unoccupied and more time in non-play 
activities. The study also confirmed earlier findings of Bruner (1977) that several children 
(termed ‘flitters’) spent much of the time moving from one activity to another and missed
out on certain aspects of nursery education. 


The study also provided tentative evidence for greater cognitive differentiation between
girls than between boys. Both the ‘loser’ and ‘gainer’ groups comprised more girls than
boys.


AN INVESTIGATION INTO PRIMARY SCHOOL TEACHERS'
CATEGORISATION OF THEIR PUPILS 


McIntyre et al. (1965) indicated that teachers categorise on two main dimensions, 
academic and behavioural. Taylor (1976) found 50 per cent of teacher categorisations 
were based on academic attributes. A more detailed categorisation list from teachers was
sought with items ranked in order of importance. Differences in ranking were looked for 
according to sex, age and social class of children; sex, age, social class origin and training 
of teacher. The effect of type of school taught in (church or state) and whether or not the 
teacher was a parent were also investigated. 


Goodacre (1968) found that teachers categorised children as coming from good/bad 
homes using cues such as clothing and parental care. Taylor (1976) believed that teacher 
expectations included pupils’ social class as a cue. This research was designed to test this
view. 


The teacher group of 142 was taken from an opportunity sample of 35 primary schools. 
The categorisation list was obtained from 400 descriptive items obtained from school records 
and reports on 360 children. Questionnaires contained a list of categorisation items (17) and
Likert-type self-report scales for assessment of measures of authoritarianism and expectancy
effect. Teachers ranked items 1 to 17 in order of considered importance. Mean ranking 
scores were obtained for each item as were mean scores for measures of authoritarianism 
and expectancy effect of the teacher group. Correlations were sought and analysis of variance 
used to test for differences. 


There was found to be an ascending order of importance for items. Academic items 
came first, behavioural second, personal third. Leadership qualities and reports from other
teachers were found to be least important as items for categorisation. 10 per cent of items
used are academic, 40 per cent behavioural, 40 per cent personal, 10 per cent reports from
other teachers. Do leadership qualities in children challenge the teacher? Low placing of
reports from other teachers may question stereotyping at this stage. 


Males ranked items in the same way as females (r = 0.83). Teachers used items in the
same order for boys as for girls. Males were inclined to structure work more than females.
This group of teachers adopted a teaching strategy midway between a democratic and an
authoritarian stance. Young male teachers (21-35 years) were more authoritarian than female
teachers of the same age group (P < 0.05). Female teachers in the middle group (36-50
years) matched male teachers in the same age group on measures of authoritarianism.
There may be increasing female authoritarianism and declining male authoritarianism as
teachers reach middle age. Teachers were not affected by social class of children when
categorising but were aware of the reality of social class background. Middle-class origin
teachers made the same ordered judgments on children as teachers of working class origin.
Infant teachers categorised in the same way as junior school teachers (r = 0.95). Infant
teachers tended to react to cues (home background, etc.) more than junior teachers. Grad-
uate and two-year trained teachers showed a more authoritarian approach than others.
Graduate teachers were less affected by expectation than other teacher groups (P < 0.05).
Teachers with children of their own reacted to expectancy cues more than those without. 


All the teachers indicated a very strong tendency towards structure in school work. 
The easy-going, unstructured situation in primary schools is probably a myth. 

A negative correlation was found to exist between measures of authoritarianism and 
measures of expectancy effect. It seemed that the more inclined a teacher was towards 
structure the less inclined he was to accept expectancy cues from children. 

The implications of these data were discussed.
PRIORITIES IN ADULT EDUCATION


Approximately 300 adults, drawn from two contrasting social class groups, living in
ten areas of Edinburgh, were interviewed about their problems and their ability to tackle 
them. The objective was to discern priorities in adult education. 


Interviewees were asked to rate the difficulties and satisfaction which they expected to 
be consequent upon attempting to tackle their problems. The consequences which were 
assessed included movement toward, or away from, the sort of person they wanted to be, 
the reactions they expected from other people, and their subjective feelings of ability or 
inability to tackle their problems. 

The data highlighted a large number of priorities in the design of experience-based
adult education programmes. But perhaps more serious were the implications for educational 
programmes in the area dealing with the explosive issue of civic attitudes and expectations. 
Many of the informants felt that the problems which plagued them could only be tackled 
through politico-bureaucratic activity and that they themselves were not the sort of person 
who could initiate such activity. Many, particularly those from working-class areas, held 
self images which were self-depreciating in the extreme. They felt that they had no right to
be listened to or make their views known. This applied to their relationship with local 
officials like their doctors, teachers and social workers, and, to an even greater extent, to 
their relationships with central and local government. 

While the data raised serious questions about the appropriateness of the civic attitudes,
perceptions and expectations which inform the citizens of our society, they also raised serious
questions about more traditional forms of education. If one has no right to be listened to 
or to make one's views known what is the point of trying to make explicit things which are 
important to one and the strategies which would need to be engaged in order to solve one's 
problems? What, indeed, is the point of learning to read, write and communicate? 

In point of fact, the data showed that many of our informants would have welcomed
opportunities to learn how to cope more effectively with their lives, but they showed that
they were unwilling to attend formal educational programmes with a view to developing
such competencies.

The implications of the data for the design of educational programmes were explored. 
BROADENING THE BASE OF EDUCATIONAL ASSESSMENT.
SOME REASONS, SOME PROBLEMS, AND SOME SUGGESTIONS 


Most teachers, parents, pupils and employers think that secondary schools should be 
concerned with developing the whole person. Yet most secondary schools tend to concen- 
trate on academic goals. There are many reasons for this, but among them is the fact that 
teachers find themselves deflected from their goal by a number of sociological pressures. 
Among these is the fact that such qualities as initiative, leadership, or the ability to co-operate 
with others will only stand to pupils' credit in the scramble for jobs if these qualities supple- 
ment rather than supplant academic success. Thus one important leverage point for bringing 
education back into schools is to find ways of giving pupils credit for having developed these 
qualities. This will have the effect of making it possible to harness the previously mentioned
sociological forces so that they will push teachers and pupils in the direction in which they 
want to go instead of away from their goals. We would have moved some way toward in- 
venting a sociological steam engine composed of psychological parts. 

It appears that teachers are reluctant to have any truck with this suggestion. The 
reasons for this reluctance were discussed.


Available procedures for measuring these wider qualities were examined. An educa- 
tional assessment service, mandated to develop appropriate assessment tools, was suggested. 


SWEENEY, C. A., GORMLY, C. M. R., FOOT, H. C., and CHAPMAN, A. J. U.W.I.S.T., Cardiff.


CHILDREN AS TEACHERS 


Because of larger classes which are resulting from reduced education budgets, attention 
was drawn to the structure of school classes. Specifically, mixed-age grouping may lead to 
a reduced work load upon the teacher if the teacher utilises the resource of child tutoring.
Child tutoring usually implies the teaching of a younger child by an older child on a one-to- 
one basis. This practice has an established tradition, yet its use today in Western society is 
sporadic. Evidence suggests that children may benefit socially (Furman et al., 1979) as well
as educationally from mixed-age interactions, and also that tutoring may benefit the tutor 
as much, if not more, than the tutee. Benefits to the tutor derive not only from role-enact- 
ment (which provides the opportunity for a child to see differing perspectives), but also from 
the fact that the tutor learns the material more thoroughly through teaching someone else. 
Tutoring may thus be most advantageous to those with behaviour problems or learning 
difficulties. Advantages to the tutee comprise more individual attention, and the possibility 
that the tutor may be better able to identify with and accommodate to the tutee. The 
evidence for child tutoring is persuasive and this paper recommended more discussion about 
formal or informal reintroduction of tutoring as an integral part of school life. 


The authors are at present undertaking research into the nature of mixed-age social 
interactions in play and in informal tutoring situations. The studies are conducted in a mobile
laboratory on location in schools using observational techniques. Methodology and data 
from one of these studies were reported in this paper and they were discussed in relation to 
child tutoring. 
REVIEWING PERSONAL PROGRESS OF UNEMPLOYED
SCHOOL-LEAVERS ON YOUTH OPPORTUNITY PROGRAMMES 


The Youth Opportunity Programmes of the Manpower Services Commission aim to 
improve employability of 16- to 18-year-old first job seekers. The programmes offer planned 
work experience and opportunities for developing work related procedural skills and personal 
competences. There are considerable differences in the interpretation of aims, learning needs 
and objectives and in the balance of learning content, between and within the four main types
of schemes which are based on employers’ premises, voluntary agencies/local government 
training workshops or further education establishments. 


‘Reviewing personal progress’ is one of a series of projects initiated by the MSC for
enriching work and learning experiences of YOP trainees. Reviewing is defined as: 


‘a process for generating qualitative information about an individual performance, 
personality traits, attitudes and feelings, revealed through specific personal experiences 
and used for specific purposes.’ 


A survey of reviewing activities in YOPs has revealed a great variety of reviewing 
purposes and methods, ranging from formal assessments by supervisors to more informal 
and formative self-reviewing by trainees. Most of the sponsors could see a need for extending
the scope, improving methods and crystallising purposes of reviewing, but also felt that 
developments were restricted by lack of expertise by staff, lack of time and the pressures of 
other priorities in the scheme, e.g. finding work experience placements, dealing with 
disciplinary problems. There was general agreement that the freedom of the sponsor to 
apply his own format of reviewing, appropriate to his perception of YOP aims, organisation 
and value system and the needs of his trainees, must be safeguarded. 

The current phase of the project aims to motivate and support the development of 
existing, sponsor-specific reviewing processes towards formative rather than merely evaluative 
purposes by: 

(a) initiating developments and identifying associated blockages and gateways in a
number of pilot schemes;

(b) attending to training needs of staff involved in reviewing by/of trainees;

(c) designing a ‘framework’ of reviewing aims and methods, enabling facilities and
boundaries to provide a general map and guide lines;

(d) evolving a strategy for the dissemination of reviewing processes to other sponsors
of Youth Opportunity Programmes.


Progress towards the above objectives was reported. 
WALKERDINE, V., and CORRAN, C. The Thomas Coram Research Unit, Institute of Education,
London University.
MAKING IT COUNT: A SEMIOTIC ANALYSIS OF
CHILDREN'S ACQUISITION OF THE ‘NATURAL’ NUMBERS

Counting is ‘fundamental’ developmentally, educationally and in mathematics itself.
Recent work on children's acquisition of number shows that children use counting as an
algorithm to produce representations of number (Gelman and Gallistel, 1978), rather than
judging numerosity or equivalence on the basis of the operation of one-one correspondence
(Piaget, 1952). Most children already count before they go to school and they are continually 
called upon to count when they attend school. 

Investigations of children's concepts of number are typically experimental. What is 
least investigated are the processes occurring in the everyday contexts in which children 
acquire the facility with number they bring with them to the experiment. The investigation 
reported in this paper analysed those contexts. It showed: 


(1) the importance of the fact that children learn to count as speech;

(2) that learning to count is more akin to language acquisition than ontogenetic
development through action;

(3) that the child's transition from speech to the use of writing is crucial in providing
the basis for arithmetic reasoning (cf. Olson, 1977).
WOOD, D. J. Department of Psychology, University of Nottingham.


CODING SOCIAL INTERACTIONS BETWEEN 
ADULTS AND CHILDREN 

There are now a bewildering number and variety of coding systems available to researchers
interested in the description, analysis and evaluation of interactions between adults
and children. Each new research project sets out with its own specific questions and these, 
in company with the researcher's theoretical orientation or attitudes towards social policy, 
together with variations across different target environments, often generate a unique system 
for examining social interaction. 

Rather than simply enumerating yet another coding system, an attempt was made in 
this paper to identify some of the major factors that influence the way in which a coding 
system takes shape. Basically, attention was focused on four dimensions which intersect to 
produce a number of distinct types of coding system. These were:


(1) Stance (theory-laden; ethnological; hermeneutic; ethnomethodological) 

(2) Perspective (historical or futuristic) 

(3) Situation (natural, contrived or concentrated) 

(4) Focus (intentions of participants; frequency of certain forms of behaviour; 

correlations between intra- and inter-situational measures) 

Various systems of analysis were categorised within the framework established by these 
distinctions and an attempt was made to identify the common strengths and weaknesses 
associated with each type of analysis. 


BOOK REVIEWS 


BROWN, G., and DESFORGES, C. (1979). Piaget's Theory: A Psychological Critique.
London: Routledge and Kegan Paul, pp. 196, c. £7.95, p. £3.95.


After generations of psychology students have learned to recite the Piaget catechism, 
it takes courage to suggest that the dogma is mistaken. Heretical voices have been heard 
for some years and Margaret Donaldson recently concluded that ‘the evidence compels us
to reject certain features of Jean Piaget's theory of intellectual development’.


Brown and Desforges, following their contribution to a recent symposium in this 
Journal, make no attempt to summarise the vast quantity of empirical work inspired by 
Piaget's writing; they discuss well-selected representative studies, drawn from a wide range 
of sources, in terms of the degree to which they provide crucial tests of essential features 
of the Genevan theories. 


A thorough examination is made of claims that Piaget and his co-workers underestimate
the intellectual abilities of children. Because of inherent methodological problems, empirical 
investigators are always vulnerable to ‘false negative error’. Too often (as Margaret 
Donaldson says) we infer failure to reason when we ought to infer failure to understand. 
It is ironic that Piaget propounds ‘a theory of cognitive development which gives little
attention to language and yet which relies heavily on verbal behaviour for data’. Experi- 
menters' instructions and children’s responses are alike open to linguistic misinterpretation, 
and the demand that a child give reasons for his answer increases the possibility of erroneous 
inference of faulty concepts. As in the psychometric field, performance may be accepted 
too naively as a measure of competence. Absence of evidence of reasoning is not evidence
of absence of reasoning. 


A still more fundamental question is whether ‘stages’ exist at all. It is maintained
that the increasing use of Piaget's term ‘décalage’ to account for (or, more correctly, to
describe) discrepancies in a child's conceptual development as shown in different situations
renders the concept of ‘stages’ ‘predictively useless and empirically unverifiable’; ‘at
best . . . an oversimplification . . . and at worst . . . dangerously misleading’, as Entwistle
has suggested.

The authors agree that Piaget's theory ‘has been fertile, and has raised fascinating
questions’, in the exploration of which a mass of data has been collected, including ‘a
prodigious body of transcultural material’. Their critical discussion of work in the latter
category is admirably rigorous and logical, and convincingly challenges Genevan claims of 
transcultural generality in the sequence of developmental stages. Much of the empirical 
work, both by Piaget's disciples and by his critics, although methodologically ingenious, is 
criticised as being not theoretically crucial, and lacking in systematic exploration of alternative 
explanations of the phenomena observed. The authors conclude that the theory ‘where
testable, proves inadequate, and is in many respects untestable’. Constructively (and in
line with sound Piaget principles!) they focus on its deficiencies, contradictions and irregularities,
but they accept that ‘in respect of some of the criteria by which scientific theories
may be judged, it is enormously successful’, and they quote some of the practically productive
basic principles—the view of cognitive development as a process of progressive emancipation 
from dependence on perception, the assertion that experience per se (without the advantage 
of predisposing schemata and retrospective interpretation) is not necessarily beneficial, the 
importance of cognitive dissonance as a means to induce cognitive conflict and eventual 
accommodation. 

In addition to presenting a balanced and well-informed view of Piaget's theory, this 
book enables the reader to form a schema which helps in evaluating the significance of 
research in this field, and which might be used by research supervisors to guide students 
into more significant areas of investigation. It will be difficult to argue for or against Piaget's 
theory without taking account of this well-documented and clearly written critique. 


WILLIAM CURR 
CLIFT, P., CLEAVE, S., and GRIFFIN, M. (1980). The Aims, Role and Deployment of
Staff in the Nursery. Windsor: NFER, pp. viii + 224, £6.75.


This book is a report of a DES-sponsored project on the aims, roles, and deployment
of teachers and assistants in 40 nursery schools and classes. Staff aims, and ascribed and 
perceived responsibilities, were investigated by interviews and questionnaires, deployment 
by time-sampled observations. An attempt was made to see whether training, expressed 
aims, and ascribed and perceived responsibilities influence staff behaviour. The research 
was implicitly addressed to the issue of whether there are clear advantages to employing 
nursery teachers rather than nursery nurses. 


Unfortunately, the form of the research does not allow this question to be answered.
In order to do so, one would have to compare how nursery nurses and nursery teachers 
carry out a similar job, i.e., taking charge of a class; strictly speaking, one would also
have to compare the development of the children in the two sets of classes. 

For this reason, only limited conclusions about how things work out now can be drawn 
from the findings. The authors were surprised to find that although teachers and assistants 
were ascribed, and perceived themselves as having different responsibilities, observation 
showed that both groups spent most of their time in similar ways. Such differences as were 
found were in line with ascribed roles. ‘Teachers spent marginally more time on involvement
in children's activities, supervision of children, talking to adults . . . assistants spent
marginally more time in dealing with equipment, care and welfare, and much time more
on housework.’ An attempt was made to look at patterns of deployment across staff
teams of different size and composition, and the effect of varying numbers of volunteers 
on staff activities. Most involvement with children's activities was found when three, but 
no more than three, volunteers were present. The authors conclude that one can have too 
many adults in a nursery and that in general the present staffing pattern works well. 


In other words, the authors confirmed the status quo. They did not address the question 
‘How can we use volunteers most effectively?’ For example, if volunteers were to be used
to work with individual children, or to take them on short expeditions, the numbers needed 
would greatly increase. Likewise, if nursery nurses were ascribed an educational role, they 
might well assume one—indeed, the authors noticed that this happened when the teacher 
was not present. 

The report contains a great deal of interesting and useful information, e.g., on staff 
aims, their views on their initial training, and especially on how they spend their time. The 
average duration of a staff-child conversation was 35 seconds, of involvement in children's 
activities, including music and story sessions, 98 seconds.


BARBARA TIZARD 


COHEN, L., and MANION, L. (1980). Research Methods in Education. London:
Croom Helm, pp. 328, c. £11.50, p. £5.50.


This book has been written because the authors feel that much educational research 
is weak because of poor research methods, a view with which few could disagree. The 
question, therefore, to ask is this: would a study of this book ensure that educational research 
was improved? This reviewer, alas, feels that it would not. 


The main flaw in this book is its size and coverage. It attempts to cover a vast field 
in a small compass with the result that there is never enough of anything. A few examples
may serve as illustration. The Nature of Science is discussed in four pages. However some
further aspects are mentioned—causation (less than a page), explanation (a little more than
a page) and so on. However brilliant the synthesis, this will not do.


However, perhaps this is unfair since in the main the book deals with research methods, 
rather than complex philosophical issues. Even here the material is too sparse, and much 
reliance is placed on Mouly, from whom certain tables are extracted. Thus in the chapter 
on correlational research there is a discussion of when correlations can be used and what 
they mean, but there is no precision. No attempt is made to discuss the problems of attenu- 
ation due to homogeneity or to the unreliability of the measures; no discussion of the 
problems of phi (item polarities and uneven splits affecting the figures); no mention of the
tetrachoric correlation or the range of indices developed to overcome these problems. No
formulae are given and there is no discussion of the underlying mathematical assumptions. 
It is idle to imagine that a knowledge of correlations to be obtained from this could enable 
a student either to prosecute or to evaluate research. This is little more than name dropping. 
It might sound impressive in a school staff-room, but Pearson would turn in his grave. 


All the topics are dealt with as superficially; and if a little knowledge may be said to be
dangerous, this book is perhaps best marked ‘Handle with care’.


The point at issue is this. Should educationists be properly equipped to carry out
adequate empirical research or not? If they should then a sound knowledge of statistics 
and research design is required. This is necessarily rigorous logically and mathematically. 
As Cattell has argued, the social sciences have too long been a refuge for the innumerate. 
Educational research even more so has fitted this description. This book will do little to 
alter this sad situation. 

Students must know about statistical methods. For this excellent books exist (Nunnally, 
Vernon, Burt, Harman, spring to mind), difficult yes, but necessary. In our view the approach 
adopted here is not sufficient.

Clearly some readers may like the general non-mathematical approach, and feel the 
reviewer is asking the impossible. These readers will enjoy the book, which is well written 
generally and easy to read with interesting examples taken from the literature. However, 
this reviewer cannot recommend it. 


JAMES CALDERHEAD 


HARBISON, J., and HARBISON, J. (1980). A Society Under Stress—Children and Young
People in Northern Ireland. London: Open Books, pp. vii+200, £9.95.


It was with a sense of great excitement that I approached the review of this book,
A Society Under Stress. For me any text which can provide some insight into the socio-
psychological problems affecting children and young persons in Northern Ireland will be
of great value.

This book provides a glimpse of the differing research strategies employed by a variety
of educationists, social psychologists and psychologists, all of whom are tackling the same 
fundamental problem of research into the lives of young people in Northern Ireland in 
contemporary society. 

Jeremy and Joan Harbison have been able to select and edit their material from the 
proceedings of a British Psychological Society two-day conference held in September of 1978 
on children and young people living in Northern Ireland. They have presented the contri- 
butions from a multi-disciplinary group of authors in six sections. 


The Harbisons’ description of the current scene in Northern Ireland, together with a 
very perceptive review of research in Northern Ireland presented by Ken Heskin from 
Trinity College, Dublin, sets the scene for the subsequent contributions. 


There are a number of general descriptive studies of the major sociological factors 
affecting children in the Belfast area. These identify the problems of forced residential 
mobility and the restriction of occupational choice for young school leavers coming from 
different social and religious backgrounds. I found the paper by Jean White on intervention 
in an infant school in a troubled area of Belfast a particularly constructive paper. It presents 
a very neat study which was designed to test out the feasibility of providing preventive 
intervention to a group of infant school children in a highly vulnerable social situation 
in Belfast. 

The study on the similarities and differences between Catholic and Protestant boys 
was particularly revealing. When interviewed they both construed elements ‘boys like me’
very similarly, yet showed considerable differences in how they construed the elements
‘Roman Catholic boys’ and ‘Protestant boys’. As the authors say, this finding is generalised
to boys within the community and is a sad reflection of the social situation in Northern
Ireland. It suggests that boys who are essentially similar in their lives and aspirations, and
even in their delinquent behaviour, tend to view themselves as different and divided in
Northern Ireland. It does also, of course, confirm that many of the supposed differences 
between the differing groups within Northern Ireland are more likely to be illusory and that 
their life-style has had more in common than they themselves realise. 


Martyn J. Gay 


VAN DER KAMP, L. J. T., LANGERAK, W. F., and DE GRUIJTER, D. N. M. (Eds.) (1980).
Psychometrics for Educational Debates. Chichester: Wiley, pp. vi + 337, c. £18.40.


This is the third book of a series reporting conferences on educational testing under the 
auspices of the editors. In this reviewer’s opinion, this volume is far more interesting, being 
less technical, than the previous ones and is strongly recommended to all those concerned 
with testing. 

It is divided into four sections: heredity, intelligence and education; the fairness of 
educational testing; the theory and practice of tailored testing, and the problems of open- 
ended examinations. 


In the first section there are two excellent papers by Eysenck and Jaspars. Eysenck 
presents his carefully argued case, based largely on biometric analyses, that intelligence 
test scores are to a considerable extent hereditarily determined. Biometric analysis as
practised by the Birmingham school of Mather and Jinks is difficult to impugn statistically 
and this clear presentation of their arguments is highly useful for educationists, who often 
appear woefully ignorant of this work. In this paper too, there is a tantalising reference to 
the work relating average evoked potentials of the EEG to intelligence test scores (correlations 
of around 0.8). If such findings can be replicated, much will be known about the biological
determination of intelligence. 

Jaspars is primarily a social psychologist and in his paper he challenges the biometric 
approach on the grounds that the various models used to explain the variance each make 
various assumptions which in some cases are not well supported. This, of course, casts 
doubt on the results. The fit of the models to the data obviously changes with the assump- 
tions. So, without denying the influence of heredity on the population variance of IQ 
scores, he argues that the effects of social interactions cannot be ignored. This is an important 
paper which deserves careful scrutiny. These two papers alone would make the book 
worth buying, and as a basis for teaching about the question of the heritability of intelligence 
and methods to be used in the study of this problem for any variable they are excellent. 


The section on bias and fairness in test selection is particularly apposite in view of the 
recent publication of Jensen’s huge volume on Bias in Mental Testing. These papers do not 
overlap too much with Jensen because they deal with detailed points, some of which Jensen 
does not write much about—for example, latent trait theory and item characteristic curves. 
Consequently this section is well worth reading. 


The section on tailored testing is excellent. There is a surprising dearth of readable 
papers on this topic (by readable I mean less numerate than Lord and Novick) and the paper 
by Lord is particularly valuable for anyone who is considering developing a tailored test. 
Although this reviewer is somewhat sceptical of some claims for tailored testing (the 
enthusiasm of Rasch scalers, for example, overwhelms reason), there is no doubt that the 
technique could be useful for the development of tests to be used as hurdles during the 
progress of a course of education, where students can take the test as they feel ready. Lord’s 
paper seems most helpful and explains the theoretical basis of these tests with great clarity. 

These papers have been singled out not only for their quality but because they happen 
to interest this reviewer especially. Many of the other papers are equally good and this 
is highly recommended for all psychometrists. It will keep them well up to date and would 
be excellent as a teaching aid for educational testing. 


PAUL KLINE 
STRUCTURAL AND ORGANISATIONAL FEATURES OF
SENSORIMOTOR INTELLIGENCE AMONG RETARDED 
INFANTS AND TODDLERS 


By C. J. DUNST, W. R. BRASSELL
(Western Carolina Center Infants' Program, Morganton, North Carolina)


AND REGINA M. RHEINGROVER
(University of Maryland) 


SUMMARY. According to Piaget, cognitive development is characterised by the simul- 
taneous and concurrent emergence of constructs that require the same underlying 
intellectual processes. Such synchronies in development—structure d'ensemble in
Piaget’s terms—are a fundamental feature of his stage theory. In the present study, 
the structural features of sensorimotor intelligence were assessed among three groups of 
retarded infants and toddlers administered the seven Uzgiris and Hunt Piagetian based 
scales. Hierarchical cluster analysis (HCA)—a procedure designed to partition variables 
into optimally homogeneous groups—was performed on two measures of relationship 
(stage congruence and intercorrelations) among the responses of the subjects on these 
scales to determine the methodological utility of the statistical procedure. The results 
proved positive, and yielded information useful for discerning the unique patterns of
organisation of sensorimotor intelligence among the three groups of children. The
potential utility of HCA as an analytical technique for studying Piaget's structure
d'ensemble stage criteria is illustrated. 
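
As a rough illustration of the stage congruence measure named in the summary, the sketch below computes, for each pair of scales, the percentage of children who perform at the same stage on both. The scale names, stage values and function names are hypothetical and are not taken from the study; the hierarchical cluster analysis itself is not shown.

```python
from itertools import combinations


def stage_congruence(stages_a, stages_b):
    """Percentage of children placed at the same stage on two scales."""
    if not stages_a or len(stages_a) != len(stages_b):
        raise ValueError("both scales need the same, non-zero number of children")
    same = sum(1 for a, b in zip(stages_a, stages_b) if a == b)
    return 100.0 * same / len(stages_a)


def congruence_matrix(scores):
    """Stage congruence for every pair of scales (keys sorted alphabetically)."""
    return {
        (s, t): stage_congruence(scores[s], scores[t])
        for s, t in combinations(sorted(scores), 2)
    }


# Hypothetical stage placements for four children on three scales.
scores = {
    "object_permanence": [4, 5, 5, 6],
    "imitation": [4, 5, 4, 6],
    "problem_solving": [3, 5, 5, 6],
}

for pair, percent in congruence_matrix(scores).items():
    print(pair, percent)
```

A table of such pairwise percentages is one kind of relationship matrix that a clustering procedure could then partition into groups of scales.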


INTRODUCTION 


It has now been two decades since Woodward (1959) published in this Journal her
now classic study illustrating the utility of Piaget's (1952, 1954) theory of sensorimotor 
intelligence for describing and classifying the behaviour characteristics of older 
mentally retarded children. In that investigation, Woodward specifically examined 
the extent to which the sensorimotor behaviours of retarded individuals conformed 
to the hierarchisation and structure d'ensemble stage criteria posited by Piaget 
(1960, 1973). The hierarchisation criterion simply states that the order of succession 
of stages is constant and invariant. The structure d’ensemble criterion includes the 
proposal that the mastery of different concepts which obey identical structural laws 
can be expected to be manifested concurrently (Pinard and Laurendeau, 1969)—the 
aspect of the criterion examined by Woodward. 


Woodward (1959) concluded her investigation of 147 profoundly and severely 
retarded children by stating that retarded individuals acquire sensorimotor behaviours 
in the same stage sequence posited by Piaget (1952). In the intervening years, this 
conclusion has repeatedly received empirical support. In all but a very few instances 
(e.g., Wohlhueter and Sindberg, 1975), various populations of both retarded and
handicapped children have been found to acquire sensorimotor behaviours in an 
ordinal, stage progression manner (Decarie, 1969; Tessier, 1969/70; Silverstein et al., 
1975; Kahn, 1976; Rogers, 1977). 


The structure d'ensemble features of the sensorimotor behaviours of the children 
in Woodward's (1959) study were assessed using a stage congruence measure. This 
procedure determines, for a group of subjects, the extent to which the individuals 
comprising the sample perform at the same stage of development in different domains 
taken in pairs (e.g., object permanence and imitation). Woodward examined the 
degree of stage congruence between problem-solving and object permanence and 
between problem-solving and the most complex type of 'circular reaction' (Piaget,
1952) manifested. Her findings showed stage congruence of 87 per cent
and 43 per cent, respectively, for the two separate comparisons. The high percentage of
stage congruence between problem-solving and object permanence was interpreted as 
consistent with Piaget's contentions concerning the structural features of cognitive 
development. Woodward (1959, 1979) attributed the large discrepancies between 
stage of performance on the problem-solving and circular reaction measures to the 
severe behaviour disturbances exhibited by a large majority of her sample, which she 
considered a primary factor inhibiting the manifestation of complex circular reactions. 
In nearly every case, her sample performed at a higher stage of development in the 
problem-solving domain. 

While Woodward's data appear to be the type of evidence needed to uphold 
Piaget’s contentions regarding the structural and organisational features of cognitive 
development, it is now questionable whether her data are in fact valid for this purpose. 
This is the case for four reasons.


First, in a recent study by Rogers (1977) of 40 profoundly retarded institutional- 
ised children, stage congruence between four sensorimotor areas (object permanence, 
space, problem-solving and imitation) taken in pairs was found to range from as
low as 10 per cent to only 57 per cent. In contrast to Woodward's findings of 87 
per cent stage congruence between object permanence and problem-solving, Rogers 
found correspondence only 37 per cent of the time for the same two domains. 


Second, both Woodward's and Rogers' studies have a major methodological flaw.
In these investigations, neither the chronological nor the mental ages of the children 
comprising these samples were adequately controlled. In both studies, the ages 
of the children varied considerably (approximately 8 to 16 years). This is unfortunate 
inasmuch as it has been found that the organisational features of sensorimotor 
intelligence differ as a function of developmental status (see Uzgiris, 1976). According 
to Wohlwill (1973), the methodological strategy employed by Woodward and Rogers 
presumes a deterministic model of development and ignores possible changing net- 
works of relationships among different domains of sensorimotor performance. In 
other words, it is not possible to discern periods of consolidation and the occurrences 
of decalages—central features of Piaget’s (1960, 1973) stage theory. 


Third, the developmental and theoretical meaningfulness of stage congruence as 
a measure of associative relationships has been questioned (Flavell and Wohlwill, 
1969; Pinard and Laurendeau, 1969; Flavell, 1970, 1971; Wohlwill, 1973; Uzgiris, 
1976). For example, Uzgiris (1976) points out that stage congruence between so 
variously formed concepts as, say, object permanence and vocal imitation, would have 
different meanings depending on the hypothesised structural interrelatedness of the 
two domains. 

Fourth, the stage concept upon which Woodward, Rogers, and others (see 
Uzgiris, 1976) have grounded their work has increasingly been criticised (Fischer, in 
press; Flavell, 1971, 1977; Wohlwill, 1973; Uzgiris, 1976). According to Piaget 
(1960, 1973), cognitive development is characterised by concurrent acquisition of 
constructs that require the same intellectual process. Yet synchrony has been found
to be the exception rather than the rule (see Wohlwill, 1973; Uzgiris, 1976; Flavell, 
1977; Fischer, in press, for relevant reviews). 


The asynchronous nature of cognitive development has led writers like Flavell 
(1977) to question the scientific utility of Piaget's structure d'ensemble stage criterion, 
while writers like Wohlwill (1973) have suggested that the stage concept needs to be 
examined anew. This paper reports the results of a methodological, cross-sectional 
investigation designed to discern the structural features of sensorimotor intelligence 
among retarded infants and toddlers. Two different measures of relationship (stage 
congruence and intercorrelations) among the responses of 143 children administered 
the seven Uzgiris and Hunt (1975) Piagetian-based sensorimotor scales were subjected
to hierarchical cluster analysis (HCA), a procedure designed to partition variables
into optimally homogeneous groups (Johnson, 1967; Hartigan, 1979). The nature
of the interrelationships among the achievements was examined at three levels of
performance (Piagetian Stages III, IV, and V).

There were three major purposes of this study. The first was to determine the 
extent to which HCA is a useful statistical procedure for providing a 'new look' at
the structure d’ensemble stage criterion. The second was to discern the nature of the 
organisational patterns of development among the participants in the study. The 
third was to determine if there were shifts in the organisational patterns of sensori- 
motor intelligence at successive levels of development. 


METHOD 
Subjects 
The subjects were 143 retarded infants and toddlers divided into mental age 
(Bayley, 1969) groups of 3 to 8 months (N = 50), 8 to 12 months (N = 50), and 12 to 
18 months (N = 43). These mental age ranges correspond, approximately, to periods 
demarcated by sensorimotor Stages III, IV and V (Piaget, 1952).


The diagnoses, and mean chronological ages, mental ages and developmental 
quotients of the children are presented in Tables 1 and 2 respectively. The three 


TABLE 1
DIAGNOSES OF THE THREE GROUPS OF CHILDREN

                                                 Mental Age Range (months)
Diagnoses                                         3-8     8-12    12-18
Down's syndrome                                    14       9       6
Mental retardation (due to unknown causes)         12      12      11
Motor dysfunction (including cerebral palsy)       11      11       7
Cranial anomalies (including micro- and
  hydro-cephaly)                                    7       4       6
Cultural familial retardation and
  psychosocial deprivation                          6      14      13
Totals                                             50      50      43

TABLE 2
MEAN CHRONOLOGICAL AGES, MENTAL AGES, MENTAL DEVELOPMENTAL QUOTIENTS
AND STANDARD DEVIATIONS (SD) FOR THE THREE GROUPS

                                       Mental Age Range (months)
Characteristics                         3-8      8-12     12-18
Chronological age         Mean         14.51    18.72    25.27
                          SD            9.01    10.58     8.74
Mental age                Mean          5.45    10.04    14.82
                          SD            1.25     1.23     1.59
Developmental quotient    Mean         49.19    66.54    64.70
                          SD           21.76    26.66    20.27

groups of children were relatively homogeneous in terms of their diagnoses (χ² = 8.36,
P > 0.05, df = 8), although there was a trend toward more Down's syndrome infants
in the youngest age group while the opposite was true for children diagnosed as being 
culturally familial retarded and psycho-socially deprived. 


As would be expected, the three groups differed significantly in terms of both 
their mean chronological ages (F = 14.79, P < 0.001, df = 2,140) and mean mental
ages (F = 552.83, P < 0.001, df = 2,140). The three groups also differed in terms of
their mean developmental quotients (F = 8.33, P < 0.01, df = 2,140). On the
average, the 3- to 8-month mental age group fell within the moderate range of retar- 
dation, whereas the other two groups fell within the mild range (Grossman, 1973). 


Procedure 
The children were administered the Uzgiris and Hunt (1975) ordinal scales of
infant psychological development at the Western Carolina Center Infants' Program,
Morganton, North Carolina as part of multidisciplinary assessment procedures 
provided to infants and their families at the time of entry into the infant intervention 
programme at that centre (Brassell, 1977). The Uzgiris and Hunt scales measure 
sensorimotor development in seven domains: object permanence, means-ends 
abilities, vocal imitation, gestural imitation, operational causality, spatial relationships 
and schemes for relating to objects. The attainments of each scale parallel the achieve- 
ments of sensorimotor intelligence as explicated by Piaget (1951, 1952, 1954). 


Each child received two separate sets of scores based on his or her performance. 
First, the children received a score equivalent to the ordinal rank of the highest item 
passed in each sensorimotor domain. The children thus received seven scores, one 
for each branch of development. Second, the children received a score equivalent to 
the highest stage of development achieved in each sensorimotor domain. As above, 
the children received seven scores, one for each branch of development. (The Uzgiris 
and Hunt scale items and the stages of performance which they measure were taken 
from Dunst, 1980). The two types of scores were obtained, respectively, for subse- 
quent assessment of the intercorrelations and stage congruence between all pairs of 
scales (21 possible comparisons per group for each measure). 


Data analysis was carried out in two phases. First, intercorrelational and stage 
congruence matrices were constructed based, respectively, on the ordinal and stage 
scores of the children. Thus, for each of the three groups, a correlational matrix and a 
stage congruence matrix were obtained. Pearson product-moment correlations
were used to determine the degree of relationship between the ordinal scores on the 
scales. Stage congruence was determined through contingency table construction in a 
manner identical to that used by Woodward (1959). 
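
The construction of a stage congruence matrix can be illustrated with a short sketch. It assumes that congruence for a pair of scales is simply the percentage of children placed at the same stage on both, which is our reading of the contingency table procedure rather than a reproduction of it. The function names and the demonstration data are hypothetical and are not taken from the study.

```python
from itertools import combinations


def stage_congruence(stages_a, stages_b):
    """Percentage of children placed at the same stage on both scales."""
    same = sum(1 for a, b in zip(stages_a, stages_b) if a == b)
    return 100.0 * same / len(stages_a)


def congruence_matrix(stage_scores):
    """Map each unordered pair of scales to its percentage of stage congruence.

    stage_scores maps a scale name to a list of stage placements, one per child.
    """
    return {
        (s, t): stage_congruence(stage_scores[s], stage_scores[t])
        for s, t in combinations(sorted(stage_scores), 2)
    }


# Hypothetical stage placements for four children on three scales.
demo = {
    "OP": [3, 4, 4, 5],
    "ME": [3, 4, 5, 5],
    "VI": [2, 2, 3, 3],
}
matrix = congruence_matrix(demo)
```

With seven scales the same function yields the 21 pair-wise comparisons referred to in the text.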


Second, each of the matrices was subjected to hierarchical cluster analysis (HCA) 
to discern the organisational features of the data. HCA is designed to summarise 
relationships among a large number of variables by partitioning the variables into 
optimally homogeneous groups or clusters (Johnson, 1967; Hartigan, 1979). With
regard to the matrices obtained in this study, the analysis begins with each scale
representing a separate cluster and then proceeds in a stepwise manner to combine
scales which are optimally connected. The analysis proceeds until all the scales form
a single cluster. The metrics used as measures of association (distance) were, respec- 
tively, the magnitude of the correlation coefficients and the percentage of stage con- 
gruence for the intercorrelational and stage congruence analyses. Clusters were 
formed using the maximum distance method (Hartigan, 1979). In this method, the
similarity between two clusters is defined as the smallest value of the metric between
any pair of scales drawn from the two clusters, i.e., the similarity of their least
similar members. This procedure yields clusters that are optimally compact (Johnson, 1967).
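
The stepwise procedure described above can be sketched as follows. This is a minimal illustration assuming the maximum method corresponds to what is now usually called complete linkage applied to a similarity matrix; it is not the program used in the study, and the function name and similarity values are hypothetical.

```python
def complete_linkage(names, sim):
    """Agglomerate clusters; sim[frozenset({a, b})] is the similarity of a and b.

    Returns the merge history as (cluster_1, cluster_2, similarity) tuples.
    """
    clusters = [frozenset([n]) for n in names]
    history = []

    def link(c1, c2):
        # Maximum method: two clusters are only as similar as their
        # least similar pair of members.
        return min(sim[frozenset({a, b})] for a in c1 for b in c2)

    while len(clusters) > 1:
        # Score every pair of current clusters and merge the most similar pair.
        pairs = [(link(c1, c2), i, j)
                 for i, c1 in enumerate(clusters)
                 for j, c2 in enumerate(clusters) if i < j]
        best, i, j = max(pairs)
        history.append((set(clusters[i]), set(clusters[j]), best))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return history


# Hypothetical similarities on a 0 to 100 scale (not the study's values).
sim = {
    frozenset({"OP", "ME"}): 70,
    frozenset({"OP", "VI"}): 10,
    frozenset({"ME", "VI"}): 20,
}
steps = complete_linkage(["OP", "ME", "VI"], sim)
```

In this toy case OP and ME merge first at a similarity of 70, and VI joins only at 10, the lower of its two similarities to the members of that cluster.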
RESULTS


Performance scores and stage placements 

The mean performance scores and mean stage placement scores of the children 
are presented in Tables 3 and 4 respectively. The levels of sensorimotor development 
achieved by all three groups differed significantly on both measures for all seven scales. 
As would be expected, the higher the mental age of the children, the higher the level 
of sensorimotor development achieved. These results simply indicate that the par- 
ticular types of sensorimotor abilities the children were capable of performing differed 
at the three mental age levels. 














TABLE 3
MEAN PERFORMANCE SCORES AND STANDARD DEVIATIONS (SD) ON THE UZGIRIS AND HUNT SCALES

                                     Mental Age Range (months)
                                  3-8           8-12          12-18
Scales                         Mean   SD     Mean   SD     Mean   SD     F(2,140)
Object permanence (14)         3.58  1.67    5.96  1.71    9.44  2.77     92.67*
Means-ends abilities (13)      4.04  2.52    7.74  2.40    9.70  1.18     83.47*
Vocal imitation (9)            1.96  1.26    2.68  1.48    4.49  1.71     35.13*
Gestural imitation (9)         1.24  1.35    3.00  1.54    5.42  2.36     64.43*
Operational causality (7)      2.46  0.93    3.20  0.61    5.12  1.50     78.23*
Spatial relationships (11)     4.58  1.85    6.68  1.45    9.14  1.93     78.56*
Object schemes (10)            3.78  2.14    6.06  1.56    7.81  1.07     68.19*

*P < 0.001
Numbers in parentheses indicate the number of items included on each scale.














TABLE 4
MEAN STAGE PLACEMENT SCORES AND STANDARD DEVIATIONS (SD) ON THE UZGIRIS AND HUNT SCALES

                                     Mental Age Range (months)
                                  3-8           8-12          12-18
Scales                         Mean   SD     Mean   SD     Mean   SD     F(2,140)
Object permanence              3.12  0.96    4.44  0.73    5.40  0.49     86.37*
Means-ends abilities           3.26  0.85    4.48  0.74    5.02  0.34     76.55*
Vocal imitation                2.10  0.54    2.36  0.78    3.47  0.93     50.93*
Gestural imitation             2.06  1.02    3.24  0.82    4.21  1.91     66.10*
Operational causality          2.94  0.55    3.28  0.50    4.65  0.97     68.23*
Spatial relationships          3.00  0.81    3.96  0.60    4.98  0.86     84.44*
Object schemes                 2.88  0.87    3.72  0.61    4.60  0.66     73.40*

*P < 0.001


Intercorrelational and stage congruence matrices 

Table 5 presents the results of the intercorrelational analyses. Among the 
youngest mental age group, 15 of the 21 correlations were positive, of moderate to 
substantial magnitude, and significant beyond the 0.01 level. Only the achievements
on the vocal imitation scale showed no significant relationship to the achievements
on the other six scales. The mean of all 21 correlation coefficients was 0.41
(SD = 0.19). Thus, on the average, 17 per cent of the variance of the scores on any
one scale is accounted for by the variance of the scores on another scale. 
Among the 8- to 12-month mental age group, 10 of the 21 correlations were
positive, of moderate magnitude, and significant beyond the 0.05 level. With the
exception of a moderate correlation between the vocal imitation and means-ends
scales, neither the achievements on the vocal imitation nor operational causality scales
showed any significant correlations with the achievements on the remaining five scales.
The mean of all 21 correlation coefficients was 0.30 (SD = 0.14). This indicates that,
on the average, only 9 per cent of the variance of the scores on any one scale is 
accounted for by the variance of the scores on another scale. 


Of the 21 correlations between the scales for the 12- to 18-month mental age 
group, 11 were positive, of moderate magnitude, and significant beyond the 0.05 level.
The mean of all 21 correlations was 0.28 (SD = 0.19), thus indicating, on the average,
that 8 per cent of the variance in the scores on any one scale is accounted for by the 
variance of the scores on another scale. 
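
The percentages of shared variance quoted for the three groups follow from squaring the mean correlation coefficients. The brief check below assumes, as the text appears to, that the mean coefficient itself is squared rather than the individual coefficients being squared and then averaged; the variable names are ours.

```python
# Mean correlations reported for the three mental age groups.
mean_r = {"3-8": 0.41, "8-12": 0.30, "12-18": 0.28}

# Shared variance (per cent) implied by a correlation r is 100 * r**2,
# rounded here to the nearest whole per cent as in the text.
shared_variance = {group: round(100 * r ** 2) for group, r in mean_r.items()}
```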

The matrices of per cent of stage congruence between all 21 pair-wise combinations
of the seven sensorimotor scales are presented in Table 6. The mean
percentages of stage congruence for all 21 comparisons were, respectively, 39.05 (SD = 14.79),
28.19 (SD = 16.92), and 30.95 (SD = 15.26) for the 3- to 8-, 8- to 12-, and 12- to 18-
month mental age groups.


Hierarchical cluster analyses 

The results of the cluster analysis are shown in Figure 1. To make both the
intercorrelational and stage congruence data directly comparable, the metrics used
as measures of similarity were converted to a scale from 0 to 100, where a correlation
coefficient or per cent of stage congruence of 0.0 is recoded to zero (minimum
similarity) and recoded values of 100 indicate maximum similarity. The left to right
presentation of clusters in each tree diagram depicts the strongest to weakest clustering 
networks. 

Table 7 summarises the results of the cluster analysis. A comparison of the 
clustering networks obtained from the intercorrelational and stage congruence 
analyses shows that both measures result in nearly identical structural networks for 
the 8- to 12-month mental age group, relatively similar organisational patterns for the
3- to 8-month mental age group, but quite divergent patterns of organisation for the 
12- to 18-month mental age group. 

Examination of the intercorrelational clustering networks across all three mental
age levels shows minor organisational shifts from the early to middle mental age
ranges, but marked shifts from both the middle to oldest and the early to oldest mental
age levels. Patterns of shifts are most evident for the operational causality, spatial
relationships, schemes and gestural imitation scales. The most striking finding is that 
vocal imitation forms a separate cluster at all three age levels. 





TABLE 7
MAJOR CLUSTERING NETWORKS AMONG THE SEVEN UZGIRIS AND HUNT SCALES

                                          Mental Age Range (months)
Type of analysis               3-8                  8-12                 12-18
Intercorrelational clusters    [OP, ME]             [OP, ME, SO, SR]     [GI, SR]
                               [OC, SR, SO, GI]     [GI, OC]             [OC, SO, OP, ME]
                               [VI]                 [VI]                 [VI]
Stage congruence               [SR, SO]             [SR, SO]             [ME, SO, OC]
                               [ME, OC, OP]         [OP, ME]             [OP, SR]
                               [VI, GI]             [GI, OC]             [VI, GI]
                                                    [VI]
FIGURE 1
CLUSTERING NETWORKS AMONG THE SEVEN UZGIRIS AND HUNT SENSORIMOTOR SCALES
(Tree diagrams not reproduced; axis label: measure of similarity.)


Note: OP = Object permanence; ME = Means-ends abilities; VI = Vocal imitation; GI = Gestural
imitation; OC = Operational causality; SR = Spatial relationships; and SO = Schemes for relating
to objects.


Except for minor variations involving the operational causality and gestural 
imitation scales, the stage congruence clustering networks for the early and middle 
mental age groups are quite alike. However, the clustering network for the oldest 
mental age group differs considerably from those of the early and middle mental age 
groups. Shifts in the patterns of organisation occur for all but the vocal and gestural 
imitation scales. While not as clearcut as the intercorrelational results, the findings 
of the stage congruence analysis show that vocal imitation is minimally related to the 
other six scales. It does however combine to form a weak cluster with gestural imita- 
tion at both the youngest and oldest mental age levels. 


DISCUSSION 


The data reported in this paper strongly indicate that cognitive development, at 
least in terms of the sensorimotor period, is asynchronous in nature. Although 
statistically significant in half the comparisons, the magnitudes of the correlation 
coefficients were found to be much lower, in general, than one would expect given 
Piaget’s contentions regarding the structural features of cognitive development. 
Piaget's contentions are especially thrown into doubt in light of the results for the
stage congruence analysis. Using even 50 per cent stage congruence as a minimal
cut-off point for support of the structure d'ensemble stage criterion, only 12 (19
per cent) of the 63 pair-wise comparisons exceed this level! The results of both the
intercorrelational and stage congruence analyses are quite consistent with findings 
reported elsewhere for both non-retarded (King and Seegmiller, 1973; Uzgiris, 
1973, 1976; Kopp et al., 1974) and retarded (Kahn, 1976; Rogers, 1977) children. 


Despite the fact that development during the sensorimotor period appears to be 
generally asynchronous in nature, it would be misleading indeed to suggest that there 
are not some unique structural patterns in the early cognitive growth of retarded 
children. The results of the cluster analysis illustrate this quite well. Before discussing 
the value of these findings, several important points need to be addressed regarding 
the correlational and stage congruence measures made in this investigation. 


Correlational and stage congruence measures are not necessarily analogous 
measures. While complete stage congruence between any two scales would result in a 
high correlation between the achievements of the two branches of development, it 
does not necessarily follow that lack of stage congruence would yield a low correlation. 
In fact, it would be possible to have complete lack of stage congruence but a substantial
correlation between the achievements of the two scales. In this case, each child’s
development on one scale would be relatively advanced compared to performance 
on another scale, but the children comprising the sample would nonetheless show 
near complete one-to-one correspondence in terms of their ordinal ranks in the 
respective branch of development. Correlational analyses yield information regarding
covariation among variables that is independent of actual levels of performance,
whereas stage congruence analyses yield information regarding developmental
synchronies (i.e., same stage performance) that is independent of the amount of
variation where stage congruence is not found. Thus, each type of data could potentially
contribute different information regarding the structural features of cognitive
development. With these points in mind, several things can be said regarding the 
cluster analysis to illustrate the methodological and theoretical utility of the type of 
information obtained. 
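
The distinction between the two measures can be made concrete with a small constructed example. The stage placements below are hypothetical and are chosen so that every child is exactly one stage lower on the second scale; the helper function is ours.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


# Hypothetical stage placements: scale B lags scale A by one stage for every child.
scale_a = [3, 3, 4, 4, 5, 5]
scale_b = [2, 2, 3, 3, 4, 4]

# Percentage of children at the same stage on both scales.
congruence = 100.0 * sum(a == b for a, b in zip(scale_a, scale_b)) / len(scale_a)
r = pearson(scale_a, scale_b)
```

Here no child shows stage congruence, yet the correlation is at its maximum because the children keep the same rank order on both scales.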


Let us begin with the most obvious finding—the independent nature of the 
development of vocal imitation abilities. The stage congruence analysis indicates that, 
in general, stage of performance in vocal imitation does not correspond to level of 
performance in the other domains (in nearly every case vocal imitation abilities were 
lower). The intercorrelational analysis indicates that there is little covariation in 
performance between vocal imitation and that of the other scales. From the stage 
congruence findings we can conclude that there are clear decalages in development, 
and from the correlational results we can conclude that such decalages are not sys- 
tematic. Thus, one could hypothesise that the cognitive processes involved in the 
acquisition of vocal imitation are quite different from those for other sensorimotor 
abilities—at least among mentally retarded children. 


To illustrate further, consider the findings regarding gestural imitation abilities 
for the 12- to 18-month mental age group (Figure 1). Whereas the stage congruence 
analysis indicates that level of gestural imitation is generally unrelated to level of 
performance in the other domains, the intercorrelational analysis shows, despite the 
decalage in development, that covariation between the achievements on the gestural 
imitation and spatial relationship scales is systematic. This could possibly indicate a 
functional relationship between the achievements of these two domains at this par- 
ticular level of development, which could then be examined in more detail in a subse- 
quent investigation. 


Besides discerning asynchronies in cognitive development, HCA also appears to 
be useful for identifying the consistencies in the development of sensorimotor intelligence.
For example, the HCA shows that performance on the object
permanence and means-ends abilities scales covaries at each of the three mental age
levels. Examination of the intercorrelational analyses reveals that these branches of 
development, at all three age levels, are included in the same cluster. Thus, both scales 
apparently involve similar underlying cognitive processes. 


Overall, the findings of the present study offer evidence in support of a model of 
cognitive development that is characterised by repetitive phases of disequilibration and 
stabilisation between structurally related cognitive domains (see Wohlwill, 1973). 
For example, the results of the intercorrelational analysis show that the achievements 
on the operational causality scale covary with performance on several other scales at 
the 3- to 8-month level, show relatively no covariation at the 8- to 12-month level and
again covary with performance on several scales at the 12- to 18-month level (see
Figure 1 and Table 7). Similar patterns are also evident for other domains
(e.g., spatial relationships). These particular patterns of development are characteris- 
tic of periods of consolidation and achievement/preparation as hypothesised by 
Piaget (1960, 1973). 


While only several examples are given here to illustrate the utility of HCA, we 
believe it is clear that this methodological strategy offers a new approach to the study 
of Piaget's structure d'ensemble stage criterion. A more complete description would,
of course, involve fuller exploitation of the data. In conclusion, HCA appears to be a
powerful analytical procedure which heretofore has not been utilised as a technique 
for studying the structural and organisational features of cognitive development. Its 
use for this purpose is highly recommended in future studies. 
THE EFFECTS OF CLASSROOM SPATIAL
ORGANISATION ON FOUR- AND FIVE-YEAR-OLD 
CHILDREN'S LEARNING 


By B. C. NASH
(Ontario Institute for Studies in Education, Canada) 


SUMMARY. In a three-year cumulative study, children's learning in 19 randomly
arranged classrooms (R) was compared with that in 19 classrooms (S) in which space 
was deliberately arranged to promote learning. Time scheduling, equipment, materials 
and teacher-child communication patterns were similar for all classrooms. Over 
250 four-year-olds and 250 five-year-olds were observed in each setting (N = 1072).
Creative productivity and skills, generalisation of number concepts, variety of oral 
language use and utilisation of listening and pre-reading materials were significantly 
better for both four-year-old and five-year-old children in the (S) classrooms. 


INTRODUCTION 


Today few educators of young children espouse the idea that their pupils should be
seen and not heard, much less that they should be so while sitting passively in rows of 
desks facing the teacher. Our kindergarten classes are usually activity-centred, with 
children spending extended periods of time learning through independent play. Not
surprisingly, some teachers wonder how they can be sure that the children are learning
and, more importantly, what the teacher can do to enhance learning through play. 

In most areas of our lives we organise the space around us according to what we 
intend to do in it (Steele, 1973). Kitchens have work surfaces and studies have desks 
and bookshelves. Senior schools have laboratories and rooms for technical subjects. 
Perhaps it is more difficult to conceive of spaces organised specifically to enhance a 
young child’s learning; or perhaps this is part of the broader problem of the difficulty 
of creating teaching strategies to increase the likelihood of specific learning outcomes. 
Whatever the reason, judging from published accounts, little serious attention has been 
paid to describing effects of classroom space planning. 

Silberman (1970) suggested that the reduction in structuring of timetabling could 
be mediated by some reflective structuring of space. In Montessori (1964) pro- 
grammes, the intent of individualising learning is reinforced by the provision of mats 
for each child. Pfluger and Zola (1969) described how a kindergarten classroom was 
emptied and the furniture stacked outside the door, with pieces of equipment and 
learning materials only put back into the room at the request of the child. Not 
surprisingly, many of the available materials remained unused. Only a few learning 
outcomes were assessed. 

In an attempt to raise the awareness of teachers to the impact of the whole gamut 
of teaching activities upon young children, the present author described the Learning 
Environment approach in nursery and kindergarten education (Nash, 1979). This 
approach analyses classroom practice into four components: the teachers’ and chil- 
dren’s use of Time, the arrangement of classroom Space, the available equipment and 
quality of learning materials (Things), and the interactions among People. An 
adequate teaching strategy will incorporate appropriate attention to each component. 

Over the past seven years, numerous teachers have implemented the approach to 
varying degrees. This has enabled the author to conduct extensive evaluation studies 
of the effects of different kinds of space planning while time scheduling, available 
materials and teacher-pupil interaction were similar across programmes. 

The studies reported here resulted from cumulative observation of programmes and 
children over a three-year period. The planning of classroom space to help children 
to learn specific skills and concepts, including positive self-concept, is part of a broader
approach to curriculum. It is assumed that successful teaching strategies, among
which would be the provision of appropriate learning spaces, begin with decisions
about objectives for learning and continue with an analysis of the objectives to the point
where teaching strategies can be created to achieve them (Robinson, 1979). The
clarity of analysis would also lead to accurate observation of the child's learning so 
that decisions can be made about the effectiveness of the teaching strategy. 


Using the notion that specific objectives can be set within each aspect of the 
programme, and that spatial location of learning centres can affect the extent to which 
these objectives are reached, this investigation compares the outcomes of programmes 
where space is planned around conceptions of the child's learning needs (with the 
intention of promoting specific learning outcomes), with those where it is planned 
around other considerations or not planned at all. The planned arrangement involves 
the grouping together of learning centres with objectives which can be mutually 
enhancing. 


METHOD 


The educational settings and the subjects 

Programmes and children in 19 randomly arranged classrooms (R) and 19 
spatially planned classrooms (S) were observed within a three-year period. Each 
classroom was used by four-year-old children in the mornings, and by five-year-old 
children in the afternoons. The class sizes ranged from 15 to 23, and were in public 
schools in both urban and rural settings in Ontario. All of the classes had a mixture 
of social, economic and ethnic backgrounds. No child was observed for more than
one year. So a school with both a four- and a five-year-old programme would be
observed in Year 1 and Year 3, not in Years 1 and 2. Table 1 shows the distribution
of subjects. 

TABLE 1
DISTRIBUTION OF SUBJECTS BETWEEN PROGRAMMES BY AGE AND SEX FOR EACH YEAR OF THE STUDY

Age                      4 years                               5 years
Sex               Boys              Girls              Boys              Girls
Year (Space)  Planned  Random   Planned  Random   Planned  Random   Planned  Random   Total
1                43      40        46      44        40      45        44      44      346
2                43      44        47      49        43      43        46      48      363
3                40      41        45      48        44      45        51      49      363
Totals          126     125       138     141       127     133       141     141     1072





Spatially planned (S) and spatially random (R) classrooms 

The overall objective was to provide rich programmes for the development of the 
child’s potential in all areas. In the classrooms this was specified by the grouping of 
various types of activities. So five major sub-areas of programme can be seen to 
require five sub-divisions of the classroom space. Four areas, for Oral Language, 
Number and Science concept development, Fine Motor and Visual and Auditory 
Readiness, and for Creative Skills and Ideas, can be accommodated within the normal 
classroom space. An adjacent area, either indoors or out, with gross motor equip- 
ment, motivating props, and, ideally, music, is available for the children's use at all 
times. Figure 1 shows an example of a classroom and adjacent space arranged for the 
five mini-programmes. Teachers who used the rational space plan did so as a result 
of workshops on the techniques. 

In the R classrooms, all of the same pieces of equipment, and similar quantities 
and types of learning materials, were present. However, their placement was 
decided on ‘housekeeping’ criteria less relevant to learning objectives, such as noise 
(work bench placed outside in the corridor), availability of water (water play, paint, 
and aquarium close together), reduction of possible mess (water play out by the 
toilet, sand only outside), tables required for lunch (all activities requiring tables placed 
in one corner of a room). Sometimes no criteria could be given by the teacher, so 
that a truly random arrangement was made. 


Observation and recording of spatial and non-spatial elements of programmes 

Observation schedules were devised in 1973 which would describe kindergarten 
teaching strategies according to the treatment of Time, interactions with People, 
Materials and Space. Over a seven-year period the schedules were field-tested in 75 
classrooms. Inter-observer reliability between 15 pairs of observers ranged from 0.89 
to 0.91 for the people interaction items, and from 0.92 to 0.96 for the time dimension. 
Records for five pairs of observers over a four-year period showed no significant 
changes after the initial four-month training period. These levels of reliability would 
be expected since the items are very specific, and where subjective judgments are 
demanded criteria for scoring are well-defined. There are 24 observations of aspects of 
time treatment, 20 of spatial arrangement and function, 22 of people interactions 
including time sampling of specific behaviours, and detailed recording of all classroom 
equipment and learning materials and of child interactions with them. 


Non-spatial elements of the programmes 

The time-tables for all of the rooms were essentially similar, all of them showing 
75 minutes of free play in a single time block. All of the teachers expected the children 
to register their choice of activities on a planning board. Beyond this, task orienta- 
tion was not rewarded (see Nash, 1979a). 


Similar patterns of teacher-child interaction in terms of time spent with individuals, 
small groups, and large groups were noted. Equal numbers of teachers in 
R and S classrooms were classified as ‘directors’ or as ‘facilitators’, and initial 
analysis was done with this information coded in. When directiveness and facilitation 
were found to be effective only through reinforcement of other teaching strategies, this 
factor was dropped in favour of the simpler analysis. 


All classrooms had all of the pieces of equipment shown in Figure 1. Similar 
commercial learning materials were available in all classrooms. All but five rooms 
had adequate non-commercial materials, and the author supplied what was needed to 
bring these five up to the standard of the rest. 


Learning outcome observations 

In September and October of each year, each child was observed while working 
at two or more learning centres. The measures used were derived from the specific 
objectives of the programme. These were pre-defined and agreed to by the 27 teachers 
whose classes were observed, and are described along with the programme outcomes. 
In April, May and June, the children were all observed again using the same criteria. 
Observation was by three trained research assistants and by the author. Results are 
reported only where inter-observer reliability reached 0.89. The criteria are described 
with the results. 


RESULTS 

Outcomes of creative activities 

General objectives for children's learning were that they would (a) develop new 
skills to express their ideas in art work, (b) grow away from stereotyped use of materials, 
and (c) come to think of themselves as creative persons. 

It was hypothesised that when the materials for various creative activities are 
placed randomly around the classroom, the children understand, perhaps rightly, 
that they are not meant to combine the materials. They then lack the incentive to 
develop skills at combining materials. 


In S classrooms, all of the centres at which children work to create products 
(paint tables and easels, collage table, clay, woodwork bench, and sewing materials) 
are grouped together. Thus the S classroom presents the idea that the materials could 
be combined, if only by capitalising on the young child's distractibility. As he or she 
tries to put the materials together, he or she is motivated to learn how to use a variety 
of methods. As children succeed they may come to see themselves as capable of 
creating things. 

The number of moves made by each child in creating a product was recorded. 
A ‘move’ in painting would be the application of a single colour to one area of the 
paper, rock or piece of other material. In collage, it would be the fixing of a single 
item to the background, in woodwork the drilling of a hole, the use of a nail, or the 
glueing together of two pieces of material. From Table 2 it can be seen that the 
average number of moves made on a product by children in classrooms with creative 
activity centres grouped together was at least double that for children in other class- 
rooms. 


From Table 2, it will also be seen that in the S classrooms children combined 
more materials to achieve effects, frequently using rocks, wood, or box-sculptures, 
painting them, decorating them with collage materials, and threaded beadwork or 
playdough. In R classrooms, children painted on paper, and rarely combined any 
materials, except when they painted or pasted box sculptures or objects made in 
wood. Here such activities were usually set up for one or two days by the teacher, 
rather than having a number of combinable materials grouped together all year. 


Not surprisingly, the children in the planned classrooms, having more practice at, 
and opportunities in, developing skills at combining materials, became more highly 
skilled at using glues, nails, screws, needle and yarn, string, ribbons and raffia as 
‘connectors’ (P < 0.001). Highly skilled children at five years would be capable of 
choosing connectors which would work; of choosing the screwdriver to match the 
type of screw; of choosing the right-sized needle and threading it; of selecting the 
most effective glue from among three types. At four years, expected skills included 
choice of the better of two glues and use of the right amount of glue, using a hammer 
and large-headed nails, and a screwdriver when given help in drilling a hole first, and 
using needle and thread in a variety of situations. The ultimate test of ability to 
choose and use connectors adequately was whether or not the pieces joined by it 
stayed together or fell apart when moved. 


A different skill of choosing and using the best medium, particularly paint—thick 
poster paint for easel work, latex wall paint for wooden models, paint mixed with 
detergents to cover glossy boxes—was also more highly developed in the planned 
classrooms (P < 0.001). So, children in the planned classrooms were observed solving 
for themselves problems about how to manipulate the materials to achieve some 
specific effect. 


In the course of recorded informal conversations about themselves and their 
creative products, each child was asked if he or she thought of him or herself as a 
person who could make things. Did he or she generally manage to make the things he 
or she wanted to? Were they pleased with what they made? Was it difficult to be an 
artist? The responses to these questions show us that children in classrooms with 
creative activity centres grouped together were more likely to see themselves as capable 
of making things than children in the other classrooms (P < 0.001). 


Outcomes of science and number activities 

Among the objectives for number and science learning were that children would 
(a) be able to understand that quantity of liquids and other continuous substances is 
not changed by the shape of the containers holding it, (b) be able to generalise this 
understanding across substances, and (c) try to use number concepts and/or 
counting in their work at science activities. 


In S classrooms, the science and number area includes sand tables, water table, 
science activity table, small pets, aquarium, and plants. 


Clear plastic bottles and jars of matched capacity and funnels were available in 
all classrooms. Using these, the observers played with children at the water and sand 
centres, posing the standard Piagetian questions such as, “Can you find me a bottle 
which will hold the same amount of water as this yoghurt pot?” “Now, do you think 
there is the same amount of water in the yoghurt pot as in the bottle?” “Why do 
you think that?” Table 3 shows the expected developmental pattern of increasing 
proportions of ‘conservation’ responses with age in R classrooms. In S classrooms, 
conservation is achieved by greater numbers of children earlier. This can be under- 
stood in terms of greater motivation to manipulate the materials. An unexpected 
finding was that some subjects understand number constancy only with respect to a 
single material, either sand or water. This is seen in R classrooms where a child is 
less likely to use the same containers to test the relationships with each material in turn. 


In R classrooms there were few instances of children attempting to count or 
measure, however primitively, when investigating phenomena such as flotation, 
magnetic attraction, magnification, seedling growth. In S classrooms, significantly 
greater numbers of attempts to use number in science activities were observed. 


Outcomes of ‘readiness’ activities 

‘Readiness’ objectives included that pupils would (a) develop the ability to 
produce and maintain increasingly complex shape, colour and number patterns, 
generalising across materials such as beads, pegboards and unit blocks, (b) utilise 
combinations of two- and three-dimensional materials to tell stories, solve number 
problems or discuss real life events, and (c) choose to spend enough time at readiness 
activities to learn from them. 


Table 4 shows that these objectives were achieved more readily in the S classrooms 
in which activities requiring a more sheltered space for completion were placed 
together. Two-dimensional paper-and-pencil and print materials could be supplemented 
by small blocks, table toys such as small dolls, doll houses, vehicles, animals. 
Considerably more time was spent by children at independent readiness activities 
during the course of the morning when a special area was provided. Conversations 
with the children elicited feelings of frustration about R classrooms: “How can I 
remember which bead goes next when Jody keeps knocking the table with the trike?” 
“This is the third time I’ve started this (puzzle). It keeps getting joggled away 
(by a child playing with a fire engine under the table).” “I can't understand this 
(ditto sheet).” In the latter case, small blocks placed on top of the dots by the author 
solved the problem. In S classrooms, children themselves more often found three- 
dimensional objects to manipulate to elucidate two-dimensional tasks. 


Outcomes of oral language activities 

In the S classrooms, an oral language area is provided with the housekeeping and 
other role play activities, dress-up clothes and mirror, puppet theatre, with large 
block centre adjacent so that oral language can be extended. 


Halliday's (1973) instrumental model of language was used to assess the oral 
language production of pupils at play in both types of classroom. For each of the 168 
four-year-olds and 187 five-year-olds, two periods of 10 minutes of imaginative or role 
play, and two 10-minute periods of play at other centres were recorded and analysed 
giving 40 minutes in all for each child. 


Table 5 shows both developmental trends towards the use of language for a 
variety of purposes. Halliday described seven instrumental uses of language in 
sufficient detail for these to be identified. Oral language development is seen as central 
to the kindergarten programme as a precedent to symbolised language a year or so 
later. In S classrooms, significantly more children used language in a variety of ways. 


The grouping of role play centres was also to facilitate the child's understanding 
and handling of conflict situations better, with the objective of reducing the incidence 
of conflict. Observations in both types of settings in April and May showed significantly 
lower incidence of conflicts which the children could not resolve themselves in 
S classrooms. 


DISCUSSION AND CONCLUSIONS 


Steele suggested that spatial arrangements in offices, reception areas, airports tell 
us what kind of behaviour is expected of us as we work in, visit, or arrive at them. For 
the child the spatial organisation in the classroom spells out our expectations for his 
or her behaviour. What the teacher does with space can thus be part of a teaching 
strategy. 

This article has reported a sample of learning outcome or learning behaviour 
comparisons between children in rooms planned to achieve complex objectives and 
children in rooms planned on the basis of the criteria of needs of the teacher. 

The underlying principle is that of capitalising on the young child's distractibility 
to transform it into generalising behaviour. The proximity of learning centres with 
similar objectives for the development of skills or concepts tells the child that he is 
permitted to transfer skills. Conversations with several hundred children showed that 
the physical separation of activities provides the opposite message for the child. 
“ Why don't you paint your [wooden] fire engine?” “ We aren't meant to paint the 
things we make." (The teacher actually had no objections and said she wondered 
why they didn't paint their creations.) The grouping of centres also prolongs the 
duration of a child's presence in areas with readiness activities, especially when the 
areas are well-defined. 

Teachers who separate sand and water play to avoid mess, or who place a con- 
struction bench outside the door, clearly subscribe to the idea of interaction between 
learning activities for young children, albeit in a negative sense. This investigation 
strongly suggests that the arrangement of classroom space can enhance or reduce 
specific learning outcomes. The differences in learning outcomes were great, but they 
must be considered as products of two approaches producing positive effects. The 
planned spatial arrangements probably enhanced the observed learning outcomes, 
but they did so mainly by removing impediments to learning. The settings where 
thought was not given to harmonising space with outcomes were no less powerful in 
producing learning, but the learnings produced were not intended by the teachers. The 
list of desired learning outcomes was endorsed by all teachers participating in the 
study. Settings planned using criteria other than the advancement of pupil learning 
often produce distractible behaviour, make it difficult for a child to complete a task 
without interruption, or unlikely that he will progress to more complex activities. 


At the pre-operational level, criteria for space planning would be the reduction of 
distractible behaviour or the use of distractibility to provide higher probabilities of 
generalisation and combined activities. As we move up through the educational 
system, the need for planning or assigning appropriate spaces for various learning 
persists. Yet the author's experience with teachers at all levels shows it to be very 
poorly incorporated as a teaching strategy. 

Perhaps the best summary of the effects of rational space planning in kindergarten 
was given by a four-year-old introducing a new child to his classroom: “Over 
here we make lots of things, and here, we find things out. This is where we pretend, 
and build, and be as grown-up as anything. And this is a nice quiet place where the 
puzzles and books are—you can't ride a trike or play ball or bring sand in here. This 
is a good place to be." 


Perhaps the most disturbing finding is that the author has seen little evidence of 
attention to planning classrooms to facilitate active learning by young children in 
Britain, Canada, the US, West Germany or Israel in recent years. 


Summary. This paper reports a comparative study of the reading lessons of deaf 
and hearing children. Deaf and hearing children were video-taped reading texts in 
the classroom in normal reading lessons. These recordings were analysed to identify 
the reasons why children stopped or were stopped in reading, the time spent in the 
lesson and reading rates. The study shows that the reading lesson of deaf children 
tends to be used for a variety of purposes and shows marked differences from that of 
hearing children learning from the same books. However, it also shows that deaf 
children develop faster reading speeds as they come to tackle more difficult texts in a 
manner similar to the hearing. The study also indicates differences in reading development 
in the deaf associated with teaching methods, but this latter result demands further 
investigation. These results are discussed in relation to reading retardation in the deaf. 


INTRODUCTION 


THIS paper is concerned with a detailed analysis of the reading lessons of deaf and 
hearing children. Its aims are two-fold. A theoretical aim is to investigate any important 
links between the reading achievements of pre-lingually, severely/profoundly 
deaf children and the experience that they have and the problems their teachers 
face in the reading lesson. 

The second aim is more practical. This study represents an initial investigation of 
similarities and differences in teaching strategies across deaf and hearing children 
and, peripherally, across different teachers of the deaf. The eventual objective is to 
determine whether any important connections exist between different teaching methods 
and reading progress in the deaf child. This aspect of the study parallels other work 
that is designed to investigate the effects of teaching techniques generally on the 
linguistic development of hearing-impaired children (Wood, in press; Breslaw et al., 
in preparation; Wood et al., in preparation). 

In a recent comprehensive survey of the linguistic achievements of deaf school- 
leavers in the United Kingdom, Conrad (1979) reports that the median reading age of 
the deaf children leaving schools in England and Wales is about 7-8 years of age. 
This low achievement is paralleled by measures of the same population’s speech 
intelligibility and lip-reading skills. The general profile of low linguistic achieve- 
ments has also been found in studies performed in other parts of the world (e.g., 
VandenBerg, 1971; Wilbur and Quigley, 1975). 

Teacher-child ratios in schools and units for the deaf and hearing-impaired are 
generally good, and deaf children almost universally receive auditory amplification 
from diagnosis of the handicap. Educators concerned with the problems of deafness 
have spent a great deal of time and effort in designing numerous schemes and methods 
for teaching the deaf child to read. And yet, in spite of this massive injection of 
economic and human resources, it seems that the achievements of the deaf remain 
uniformly low. The present investigation was designed to see how, if at all, these low 
achievements of the deaf child are reflected in his reading lessons. What goes wrong in 
attempts to teach the child to read? What problems and difficulties figure so strongly 
in the deaf child’s attempts to learn how to read that all efforts to help him learn seem 
to meet with only marginal success? 
These are the general questions underlying the design of the present study. 


METHOD 

Design and procedure 

This study is comparative in nature. One essential feature of its design rests on 
the matching procedure employed in the investigation. Since the primary aim is to 
identify the demands placed upon the deaf child in reading lessons and to uncover any 
differences and similarities between these demands and those made of hearing children, 
any matching procedure had to rest on the existing practices used in schools for 
deciding what text a particular child should read. Consequently, an ‘ecological 
matching’ procedure was developed. 


Teachers of the deaf were asked to teach individual children from their class in a 
normal teaching session, using the primer they would normally have used and reading 
that part of the text which they were due to read. As in schools for the hearing, a 
common situation used in teaching reading to the deaf child is one in which the child 
works individually with his teacher through a series of reading primers. This situation 
formed the focus of the study. Individual sessions were video-taped in the classroom 
for subsequent analysis. The matching procedure was as follows. A number of 
primary schools for hearing children were contacted, and the primers used in teaching 
reading ascertained. Schools were then selected that used one of the primer series 
being employed in the deaf classes. Finally, teachers of hearing children were video-taped 
during the naturally occurring teaching sessions in which their pupils, using the 
same texts as the deaf pupils, arrived at and read those same texts. In this way, 
individual matches were obtained for each of the deaf children. 


The video-taped recordings were subsequently transcribed. The major dependent 
variables of interest were: 


(1) Frequency of stops initiated by both teacher and child. 

(2) Reason for stops derived from the teacher's interpretation of the child's 
failure. 

(3) Number of words actually read. 

(4) Time spent in reading. 


The reasons for stops were analysed according to the following scheme: 


(A) Phoneme-grapheme stops. The teacher focused on the relationships 
between speech patterns and written patterns. 

(B) Lexical stops. The teacher checked the child's knowledge of the meaning of 
a word or phrase and/or explained/demonstrated its meaning. 

(C) Articulation stops. The teacher asked the child to repeat a word more 
correctly or with different intonation. 

(D) Reinforcement. Pauses or stops where the teacher praised the child for his 
reading. 


Subjects 

Subjects were 14 deaf and 14 hearing children with their teachers. The deaf 
children had a mean hearing loss of 90 db averaged over five frequencies in the better 
ear (SD = 15). Children ranged in non-verbal IQ from 82 to 131 and in age from 
6 years 6 months to 10 years 3 months. The average age of the hearing children was 
6 years 7 months, with a range from 4 years 11 months to 9 years 1 month. 


All the deaf children were pre-lingually deaf and had no additional handicaps. 
One child was of deaf parents, the remaining 13 had hearing parents. They were 
drawn (seven each) from two schools for the deaf. 


RESULTS AND DISCUSSIONS 

Stops in reading 

An examination of the frequency of stops in reading showed a clear difference 
between the hearing and deaf lessons. Deaf children stopped reading or were stopped 
by teachers significantly more frequently than the hearing children (mean 17.6 stops 
for the deaf and 8.9 for the hearing, F = 16.53, df = 1, 13, P < 0.001). More important, 
however, was a marked difference in the reasons for stops, as identified by the 
teacher's interpretation of the child's difficulty. Deaf children were stopped for a 
mean 2.9 different reasons, hearing children for a mean 1.5 reasons (sign test, x = 1, 
N = 14, P < 0.01). In other words, teachers of the deaf interpreted their children as 
having a greater range of difficulties than the teachers of hearing children did with 
their children. Put another way, the goal structure of the teachers of the deaf was 
more elaborate than that for the teachers of the hearing children. 
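
The sign test quoted above can be illustrated with a short calculation. The following is a minimal sketch only, assuming that x is the number of matched pairs showing the less frequent sign and that all 14 pairs were untied; the function name `sign_test_p` is introduced here for illustration and is not taken from the study.

```python
from math import comb


def sign_test_p(x: int, n: int, two_sided: bool = True) -> float:
    """Exact sign-test probability of x or fewer cases of the rarer sign among n untied pairs."""
    if not 0 <= x <= n:
        raise ValueError("x must lie between 0 and n")
    # Under the null hypothesis each sign has probability 1/2, so the count
    # of the rarer sign follows a Binomial(n, 0.5) distribution.
    tail = sum(comb(n, k) for k in range(x + 1)) / 2 ** n
    return min(1.0, 2 * tail) if two_sided else tail


# Assumed reading of the reported figures: x = 1 among 14 matched pairs.
p_two = sign_test_p(1, 14)
p_one = sign_test_p(1, 14, two_sided=False)
print(f"two-sided P = {p_two:.4f}, one-sided P = {p_one:.4f}")
```

Under these assumptions both the one-sided and the two-sided probabilities fall below 0.01, which is consistent with the level reported in the text.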


A closer examination of the reasons for stopping revealed that the frequency of 
stops for reading failure (phoneme-grapheme stops) was not significantly different for 
the two samples, although the deaf children encountered more stops of this type 
(P < 0.133, sign test). There was also a tendency for deaf children to stop or be stopped 
more frequently for articulation failure (P < 0.06, sign test). However, the most 
marked difference between the two samples was the frequency of stops for meaning 
failure (lexical stops). The deaf children's teachers spent much more time teaching 
children the meaning of the words they were encountering in the text (P < 0.001, 
sign test). When we looked at stops which were made to praise and encourage the 
child, however, the pattern was reversed. Here hearing children were more likely to 
encounter such a stop (P < 0.02, sign test). 


The actual purpose of the reading lesson for the two groups was, then, quite 
different. The hearing child's lesson consisted mainly in stops for reading failure per se 
with other stops to provide positive reinforcement. Deaf children were equally likely 
to be stopped for reading failure but also stopped or were stopped to check that they 
actually knew the meaning of words and, less frequently, for articulation training. 


Time spent in reading and reading rates 

Although deaf and hearing children were matched on the section of text read and, 
hence, read exactly the same number of words of print, the actual time spent in reading 
was markedly different for the two samples. Deaf children spent far longer in reading 
their text than their matched controls (F = 24.87, df = 1, 13, P < 0.00025). 
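
An F ratio with 1 and 13 degrees of freedom is what a two-condition comparison across 14 matched pairs yields, since in that case F equals the square of the paired t statistic. The sketch below illustrates this relationship only. It assumes that the analysis was of this matched-pairs form, and the reading times are hypothetical values, not data from the study.

```python
from math import sqrt
from statistics import mean, stdev


def paired_f(first, second):
    """Return (F, df1, df2) for a two-condition matched-pairs comparison."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length samples with at least two pairs")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    # Paired t statistic: mean difference divided by its standard error.
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    # With only two conditions, the repeated-measures F is t squared.
    return t * t, 1, n - 1


# Hypothetical reading times in seconds for four matched pairs (NOT study data).
deaf_times = [120.0, 150.0, 140.0, 200.0]
hearing_times = [110.0, 130.0, 120.0, 150.0]
f_value, df1, df2 = paired_f(deaf_times, hearing_times)
print(f"F = {f_value:.2f}, df = {df1}, {df2}")
```

With 14 pairs the same function would return degrees of freedom of 1 and 13, matching the form of the values reported here.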


There are a number of possible reasons for this difference in reading time, 
reasons that hold out quite different implications for the nature of deaf and hearing 
children's experiences in the reading situation. In the first place, the extra time could 
be taken up by repeating words or sections of text at the teacher's request. It could 
be due to the time spent on the extra stops made in the reading lesson and/or the 
duration of those stops, or it might be due to differences in the reading rates of the 
two samples. 


Since the children read matched texts, a comparison of the total number of words 
actually articulated by the child in reading enables any differences in frequency of 
repetitions to be calculated. There was no difference between the two groups on this 
measure (F = 0.64, df = 1, 13, P < 0.44). Teachers of the deaf did not demand that 
their children re-read sections of text any more frequently than teachers of the hearing. 


An examination of the proportion of time spent in actually decoding print into 
speech, however, did reveal a marked difference between deaf and hearing samples. 
The deaf spent relatively less time actually reading (F = 38.57, df = 1, 13, 
P < 0.00003). The greater frequency of stops for the deaf children did contribute to 
the reading time—as might be expected. 

A more important comparison lies between the actual reading rates of the two 
samples. The reading rates (i.e., rates when time for stops was excluded from 
the analysis) for the hearing sample ranged from 13.2 to 130.4 words per minute 
(mean 64.2 w.p.m.). The range for deaf children was 14 to 142 w.p.m. (mean 50.3 
w.p.m.). This difference in mean rate just failed to achieve significance (F = 4.05, 
df = 1, 13, P < 0.06), with the deaf reading relatively more slowly. 

This final result is an extremely important one. Given the deaf child's retarded 
language development, poor articulation and more frequent stops, it would not have 
been surprising to find them actually reading at a much slower rate than hearing 
matches. This, however, was not the case. The mean value of 50.3 w.p.m. lies inside 
the 40-120 w.p.m. rate at which speech is intelligible to a listener. On this measure, 
then, there is no indication of marked pathology in the reading situation for the deaf 
child. 

However, a post hoc examination of the reading rates of the group of deaf children 
revealed a marked difference in the reading speeds of the two sets of seven children 
drawn from the two schools. Children from one school (School A) read significantly 
faster than those from the other (School B): F = 5.12, df = 1, 12, P < 0.043. Taking 40 
w.p.m. as a cut-off for speech at an intelligible rate, six out of the seven children from 
School A achieved this criterion while none of the children from School B did so. 
We return to a consideration of these differences after the final item of analysis. 
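
The reading-rate figures for the deaf sample can be checked against the individual speeds listed in Table 1. The sketch below is illustrative only: the speeds are transcribed from Table 1 on the assumption that the unclear marks in the printed values are decimal points, and the helper `reading_rate` is a hypothetical name for the rate definition given in the text (words read per minute, with stop time excluded).

```python
from statistics import mean

# Reading speeds (w.p.m.) of the 14 deaf children, in the order listed in Table 1.
deaf_speeds = [22, 19, 23, 14, 20, 38.7, 30.5, 142, 31, 62, 48, 87.5, 79.2, 87.5]

INTELLIGIBLE_CUTOFF = 40  # w.p.m., the cut-off for intelligible speech used in the text


def reading_rate(words_read: int, total_seconds: float, stop_seconds: float) -> float:
    """Words per minute with the time spent in stops excluded."""
    reading_seconds = total_seconds - stop_seconds
    if reading_seconds <= 0:
        raise ValueError("no reading time left after removing stops")
    return words_read * 60.0 / reading_seconds


mean_speed = mean(deaf_speeds)
at_or_above = sum(speed >= INTELLIGIBLE_CUTOFF for speed in deaf_speeds)
print(f"mean = {mean_speed:.1f} w.p.m., range = {min(deaf_speeds)} to {max(deaf_speeds)}")
print(f"{at_or_above} of {len(deaf_speeds)} children read at {INTELLIGIBLE_CUTOFF} w.p.m. or faster")
```

On this reading of the table the mean is 50.3 w.p.m., the range is 14 to 142, and six children reach the 40 w.p.m. cut-off, which agrees with the figures given in the text.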


Text effects 

Teachers of both the deaf and hearing children had matched child to text themselves. 
They followed their normal practices for deciding what demands to put on 
their children in the reading situation. The texts varied quite markedly in difficulty. 

The list of texts used is shown in Table 1, against the age and hearing losses of the 
deaf children reading them. They range from first primers like 1, 2, 3 and Away 
designed for children at the very beginning of reading, through to a Puffin book, which 
lies beyond primer series altogether and demands an estimated reading age of around 
10 or 11 years. This last book was read by a deaf child with a mean hearing loss of 
108 and an IQ of 129. She read it at a mean reading rate of 87.5 words per minute in 
comparison with her hearing match's speed of 120.8 w.p.m. 

To throw some light on the question, “Are teachers of the deaf simply moving 
children on through reading primers ‘mechanically’ with no associated development 
in the deaf child's reading?”, the following calculation was performed. The hearing 
children were rank ordered on the basis of reading speeds. The fastest hearing child 
read at 130.4 w.p.m., the slowest at 13.2. The child reading at the fastest speed was 
reading from Wide Range Readers, Green Book 1, estimated to demand a reading age 
of around 8 or 8½ years. The next fastest reader was the child reading the Puffin 
book at 120.8 w.p.m. The slowest hearing reader was tackling a Breakthrough book, 
The Christmas Tree, estimated to demand an initial reading level (nominally 5 to 5½ 
years), since it was a beginning primer. Generally speaking, the children taking the most 
advanced books were also reading at a faster rate. Thus teachers of the hearing 
children were providing children with more difficult books as they read faster. This is 
a logical procedure. By comparing the rank order of reading speed in the hearing 
children to the rank order of reading speeds of their deaf matches, it was possible to see 
how far both sets of teachers (of the deaf and of the hearing) were following the same 
basic logic in matching the demands of the book presented to a child and his rate of reading. 
The correlation was high and significant (r_s = 0.82, P < 0.01, N = 14). 

TABLE 1
TEXTS USED AND PROFILES OF DEAF CHILDREN READING EACH TEXT

                                            Deaf Child
                                            Age            Hearing Loss   Reading Speed   Estimated Reading Age
Text                                        (yrs:months)   (db)           (w.p.m.)        demanded by Text

Breakthrough: The Christmas Tree            8:3            73             22              5-5.5 years
Griffin Pirate Readers: The Three Pirates   7:1            52             19              5-5.5 years
1, 2, 3 and Away: Billy Blue Hat            6:7            105            23              5-5.5 years
Happy Venture: Book 1                       6:11           91             14              5-5.5 years
Griffin Pirate Readers: Roderick the Red    6:6            83             20              5.5-6 years
Racing to Read: Book 3                      7:5            100            38.7            6-6.5 years
Racing to Read: Book 4                      8:10           106            30.5            6-6.5 years
Silver Book 4                               6:10           83             142             6-6.5 years
Webster Readers                             8:3            83             31              7.5-8 years
Racing to Read: Book 10                     7:2            92             62              7.5-8 years
Wide Range Readers: Green Book 1            10:2           92             48              8-8.5 years
Wide Range Readers: Blue Book 1             8:5            86             87.5            8-8.5 years
Wide Range Readers: Green Book I            9:7            100            79.2            8-8.5 years
Puffin Books: Green Smoke                   10:3           108            87.5            10-11 years


The children from school A, who were reading faster, were also somewhat
older than those from school B. The reason for this was that school A do not start
their children reading until they feel they have mastered a sufficient vocabulary and
syntactic ability to tackle the reading process. Consequently, children from the two
schools had been reading for a similar length of time. Since children from school A
were older and more advanced, however, they were presented with books designed
for older hearing children. To see how far the children's maturity played a part in
their reading rates, we correlated reading speed against children's mental age. This
was done in preference to either age or intelligence in view of the relatively wide
ranges of age and intelligence sampled. This correlation was also significant
(rs = 0.82, P<0.01, N = 11; IQs were not available for three of the children in the
sample). Consequently, it seems that as the deaf child gets older he is presented with
more difficult texts and, like (younger) hearing children, reads them out to his teacher
at a faster rate.

One final item of analysis was a comparison of the deaf children's reading speed 
with their hearing losses. This revealed a non-significant correlation. This should 
not be taken as evidence that hearing loss does not affect reading speed, however, 
since we had selected only severely-profoundly deaf children for our study (mean 
90 db). 
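
A calculation of this kind can be sketched briefly. The following is a minimal illustration only, assuming a rank (Spearman-type) coefficient with averaged ranks for ties; the text does not say which coefficient was used for this comparison, and the function names are hypothetical. The values are transcribed from Table 1 as printed.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 upwards, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman coefficient: the Pearson correlation of the two rank series."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Columns of Table 1, in row order, as printed (db and w.p.m.).
hearing_loss = [73, 52, 105, 91, 83, 100, 106, 83, 83, 92, 92, 86, 100, 108]
reading_speed = [22, 19, 23, 14, 20, 38.7, 30.5, 142, 31, 62, 48, 87.5, 79.2, 87.5]

print(round(spearman(hearing_loss, reading_speed), 2))
```

With only 14 children, all drawn from a narrow band of severe to profound losses, any coefficient obtained this way says little about the wider relationship between hearing loss and reading speed.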


CONCLUSIONS 
This small-scale study has uncovered aspects of the deaf child’s experience in 
beginning reading lessons that are more markedly abnormal and other factors that 
seem relatively normal—if delayed in relation to hearing children. 
In the first place, it seems clear from our results that some teachers of the deaf 
use the reading lesson not simply as a vehicle for helping the child to discover
relationships between the spoken and written word but also as an opportunity for teaching
language itself. The very frequent breaks given over to checking or trying to build
up the child's vocabulary, together with the less frequent stops to clarify the child's
articulation, led to a much longer lesson, and, occasionally, to lessons where children
were being stopped every two or three words. In fact, seven of the 14 children whose 
lessons we observed were being stopped, on average, every four words or less. The 
overall rate of reading—stops included—dropped in some lessons to ten words or 
fewer per minute. 


It seems most doubtful that a child reading only a few words per minute, stopped 
frequently for vocabulary checks, can possibly be deriving much by way of connected 
language from his lesson. Where attention is frequently given to single words, and 
presentation of the story line, or even of longish phrases, seldom achieved, it seems that 
the best that can be learned is a vocabulary for isolated words. Even when we took
out the time given over to stops and looked at reading speed alone, we found that eight
of the children were still articulating at less than 40 words per minute, making it
doubtful that they were deriving much sense from the structure of the text.


The reading lesson, in short, means very different things for the deaf and hearing 
child. The hearing child is encountering vocabulary, syntactic devices and so forth, 
that are already present in his spontaneous language. This is not the case for the deaf 
child. We know from our own work and that of others (Wilbur and Quigley, 1975; 
Wood, in press) that deaf children of the age sampled here are unlikely to have mas- 
tered many of the syntactic devices found in even early primers and the frequent stops 
for vocabulary checks suggest that they often lacked even the basic word meanings 
onto which the written word could be mapped. 


However, in spite of these marked differences between the experiences of deaf and 
hearing children learning to read, one feature of the results suggests a more positive 
outcome of the deaf child’s reading experiences. The fact that the children reading 
the more advanced books also read at a faster rate, as the (younger) hearing children 
did, suggests that some reasonably comparable—if delayed—developments are taking 
place in the deaf sample. However, this result was associated with a difference between schools.


The children from school A were reading more difficult books than those from 
school B and reading them at a faster rate. The two schools differed in their general 
philosophy about teaching the deaf child how to read. Although the children in 
the two schools had been reading for similar lengths of time, those in school A had 
been introduced to reading lessons much later than those in school B. School B 
attempt to use the written word as a vehicle not only for teaching reading but also 
language itself. Hence, they begin to use the printed word quite early in the child’s 
schooling. School A on the other hand argue that the child cannot learn to read until 
he has mastered enough vocabulary and grammatical knowledge to enable him to 
translate the printed code into a phonetic one. Thus the fact that children at school A 
had had a similar length of reading teaching to those in school B and yet read more 
advanced texts faster suggests (but no more) that the early experiences of the children 
in school B may not be adding to their reading development. Indeed, given the 
frequent, long stops in reading and low overall reading rate, their children were 
arguably being confronted with a rather difficult and stultifying task. However, 
because the children in school A were also older, the schools' difference is confounded
with mental age which also correlates with reading speed. 


If differences in teaching methods do exert an effect on reading development and 
motivation to read over-and-above normal maturation, then clearly the implications 
that follow would be important both for the theory of reading development and the 
practice of teaching reading to the deaf. However, our data are not rich enough to 
enable any partialling out of the effects of mental age and teaching method. On the 
basis of our current evidence we can only point to the potential importance of more
extensive studies of reading lessons guided by different pedagogical philosophies and 
conclude that in several important respects the deaf child's experiences in learning to 
read tend to be markedly different from those of the hearing child. But such differ- 
ences may be minimised by the teaching strategy adopted. 
COMPETENCE AND PERFORMANCE VARIABLES IN THE
ASSESSMENT OF FORMAL OPERATIONAL SKILLS 


By A. M. SLATER AND DENISE J. KINGSTON
(Department of Psychology, Washington Singer Laboratories, University of Exeter) 


SUMMARY. In two experiments seven-year-old children and university students were 
asked to judge the truth value of questions asked about the colour of counters that were 
either concealed in the experimenter's hand or shown to the subject. It was found that 
the children were able to demonstrate one of the major characteristics of formal
operational thought, namely the ability to reason in terms of verbally stated hypotheses
without reliance on direct, physical experience. Specifically, they were able to evaluate
correctly questions whose truth value depended on their logical form rather than on
empirical considerations. However, under slightly different testing conditions the
students failed to answer the same questions correctly. Our results show that
fluctuations in task demands can profoundly affect the manifestation in performance of
a subject's underlying formal operational competence, and lend support to the view
that some features of the later period are emerging during the concrete operations 
period. 


INTRODUCTION 


IN recent years there has been a mounting body of research concerned with Piaget's 
theoretical treatment of formal operations (Inhelder and Piaget, 1958). While there 
seems now to be widespread agreement that there develops, from adolescence onwards, 
a level of thinking that is qualitatively different from earlier stages, there is considerable
disagreement concerning its age of onset and its generality (Piaget, 1972; Neimark,
1975a, 1975b; Martorano, 1977; Capon and Kuhn, 1979). The various factors that
contribute to the attainment of formal thought have, until recently, received relatively 
little attention. One important set of factors relates to the child's underlying compe- 
tence, and his actual performance on a particular task. This distinction has been 
made by (among others) Flavell and Wohlwill (1969) and Bolton (1972). They argue 
that there are two main determinants of a child's performance in a cognitive task. One 
is the child's competence—possession or not of the rules, skills or abilities which the 
task demands for its correct solution. The other is a set of performance variables 
related to the difficulty of the task, for example, clarity of instructions, memory and 
attentional demands, and familiarity and complexity of the stimulus materials. 
Danner and Day (1977) point out that the open-ended and relatively non-specific 
instructions that are typically used in formal operations tests can be so ambiguous 
that they fail to elicit optimal performance from adults known to possess the appropri- 
ate reasoning abilities (i.e., graduate students). This means that this type of research 
may be particularly prone to Type 2 errors, and Danner and Day suggest that much 
of the work on the development of formal operations “overestimates the age of
onset and underestimates the frequency of occurrence of these skills” (p. 1600).


The present study is concerned with the distinction between competence and 
performance, and addresses itself to one of the most important characteristics of 
formal, propositional thinking, namely the ability to function at the abstract level of 
logic and to reason in terms of verbally stated hypotheses, thus freeing the child from 
reliance upon the direct, physical evidence of experience. Osherson and Markman 
(1975) found that 7- and 8-year-old children had difficulties in evaluating the truth 
value of simple contradictions and tautologies. Their stimuli were poker chips of 
different colours, each chip being of one colour only. To assess the child's
understanding of contradictions and tautologies a chip was concealed in the experimenter's
hand, and the child was asked to say ‘true’, ‘false’, or ‘can't tell’ to statements
such as: (a) ‘the chip in my hand is white and it is not white’; (b) ‘either the chip in
my hand is yellow or it is not yellow’. Statement (a) is an example of a contradictory
sentence: it is false by virtue of its logical form. Statement (b) is a tautological
sentence: it is true regardless of the colour of the chip. Osherson and Markman termed
these types of statement non-empirical, since their truth value can be determined 
independently of observing the colour of the concealed chip. Few of their subjects 
answered these types of question correctly. However, when the children were 
presented with empirical statements, that is, questions whose truth value could only 
be determined by seeing the chip (for example, ‘either the chip in my hand is red or it
is not green’), they usually had little difficulty in giving the correct answer. They
reported that “In general, the non-empirical items were substantially more difficult
than their empirical counterparts . . . of comparable logical structure” (p. 217),
and suggested that the children's difficulties with the former resulted from their
treating the questions as if they were empirical ones, i.e., as if the colour had to be seen.
The child's reliance on direct physical experience, and the apparent inability to 
function at the abstract level of logic fits well, of course, with Piaget's theorising. 


However, Osherson and Markman did not report giving the children experience 
with the types of question involved prior to testing. Additionally, they asked only a 
few non-empirical questions, randomly ordered among a much larger set of empirical
ones. It is possible that this may have ‘set’ the children to respond to all questions
as if they were empirical. The combination of these variables may have substantially
increased the performance demands of the task. In order to distinguish properly
between competence and performance variables we carried out two experiments. In 
the first, young children were given relevant experience with non-empirical and 
empirical questions prior to testing, and the test phase contained an equal number of 
the two types of question. In Experiment 2 university students were tested on similar 
questions, but with a larger number of empirical ones and without previous experience. 


METHOD 
EXPERIMENT 1 
The experiment was designed to test 7-year-old children's ability to determine the 
truth value of non-empirical (contradictory and tautological), and empirical questions. 


Subjects 
Twenty-one children were used, ten males and 11 females, mean age 7 years 
9 months. 


Materials 

The only materials were an array of small plastic counters, in five different colours 
(blue, red, yellow, green, white), taken from the popular board game ‘Tiddlywinks’.
Each counter was of one colour only. 


Procedure 

Each child was tested individually for approximately 35 minutes. Before testing 
began the (female) experimenter played a game of tiddlywinks with the child. This 
was intended both to establish a relaxed atmosphere and to generate interest in what
was to follow. At the end of the game five counters, one each of the five colours, were
placed on the table between the child and the experimenter. The remaining counters 
were hidden from the child's view, so that the experimenter could select one without 
the child knowing its colour until it was shown to him/her. The child was told that the 
experimenter would ask some questions about the counters she would choose, and that
in answer to each question he/she should say ‘true’, ‘false’, or ‘can't tell’. In order
to ensure that the child understood what to do, and to give him/her some experience 
with the type of question to be asked, he/she was given pre-testing experience with the 
14 questions listed below. In describing the questions the colour of the counter, if 
shown, is given in parentheses after each question, as is the correct answer, and the 
term ‘hidden’ is used to indicate that the counter selected was concealed in the 
experimenter’s hand. 


The questions were: (1) How many different coloured tiddlywinks are there in
front of you? (‘Five’); (2) Can you tell me what colour this one is without guessing?
(Hidden, ‘no’, or ‘can't tell’); (3) Is this true, false, or can't you tell: this one is
red? (Green, ‘false’); (4) Is this true, false or can't you tell: this one is red? (Red,
‘true’); (5) Can a tiddlywink be two different colours at once? (‘no’); (6) If I said
that this one is red and blue would you say that was true, false, or can't you tell?
(Red, ‘false’); (7) True, false, or can't you tell: this one is blue and not blue at the
same time? (Blue, ‘false’); (8) True, false, or can't you tell: this one is blue and
not blue? (Green, ‘false’); (9) This one is green and not green? (Hidden, ‘false’);
(10) True, false, or can't you tell: this one is any colour in the world? (Hidden,
‘false’); (11) This one is either red or yellow or blue or green or white (Hidden,
‘true’); (12) This one is white or not white (Hidden, ‘true’); (13) This one is either
yellow or not yellow (Yellow, ‘true’); (14) This one is either blue or not blue (Red,
‘true’).


The meaning of the logical words was explained if the child had difficulty 
in providing the correct answer. For example, when question (12) was reached it 
often had to be explained that any colour other than white constituted a ‘not white’
colour. 


After this pre-test the experimental questions were presented. As the child
answered each question he/she was rewarded with ‘good’ from the experimenter. If
the child hesitated or appeared puzzled the question was repeated, but no feedback
was given regarding the correctness of the answer. In order to maintain the child's
interest he/she was told that ‘if you do well’ he/she would receive a packet of Smarties
at the end. In fact, regardless of the outcome each child was so rewarded. The 14 
experimental questions are given in Table 1, together with the correct answers, and the 
children's responses. Each question was asked twice: first with the counter concealed 
in the experimenter's hand (the ‘hidden’ condition); second, when the child gave his
answer the counter was shown and the question repeated. To avoid order effects in 
presentation the questions were asked in a different randomly determined order for 
each child. 


A few points of detail are worth noting about the experimental questions. The
correct answer is, of course, the same in both the ‘hidden’ and ‘shown’ conditions
for the non-empirical questions, but must always be ‘can't tell’ in the ‘hidden’
condition for the empirical items. As can be seen from Table 1 there are two
tautological questions and five contradictions. The reason for the difference is that the
contradictions permit more permutations. Thus, for the tautologies (questions 1 and
2, Table 1), and for the contradictory questions 3 and 4, only one colour is mentioned,
and the colour in the ‘shown’ condition is either that colour or not. However, while
the contradictory questions 5, 6 and 7 have the same logical status as contradictions,
two colours are mentioned in the questions, and the colour in the ‘shown’
condition is either mentioned in the question (nos. 5 and 7) or not (no. 6). While it
seemed worthwhile to include these variations in the contradictory questions,
subsequent analysis of the results showed them not to have an effect on the children's
responses.
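
The scoring rule described above can be made concrete with a small sketch. This is an illustration only, not part of the original procedure: the function names are hypothetical, each question is modelled as a predicate on the counter's colour, and the set of possible colours is taken to be the five used in the experiment.

```python
COLOURS = ["blue", "red", "yellow", "green", "white"]


def correct_answer(statement, shown_colour=None):
    """Return the correct reply for a question about one counter.

    statement: function taking a colour and returning True or False.
    shown_colour: the counter's colour, or None in the 'hidden' condition.
    """
    if shown_colour is not None:
        return "true" if statement(shown_colour) else "false"
    # Hidden: the reply is determinate only if every colour gives the same value.
    outcomes = {statement(colour) for colour in COLOURS}
    if outcomes == {True}:
        return "true"
    if outcomes == {False}:
        return "false"
    return "can't tell"


def tautology(colour):
    # 'This one is either yellow or not yellow'
    return colour == "yellow" or colour != "yellow"


def contradiction(colour):
    # 'This one is green and not green'
    return colour == "green" and colour != "green"


def empirical(colour):
    # 'Either this one is red or it is not green'
    return colour == "red" or colour != "green"


for name, question in [("tautology", tautology),
                       ("contradiction", contradiction),
                       ("empirical", empirical)]:
    print(name, "| hidden:", correct_answer(question),
          "| shown green:", correct_answer(question, "green"))
```

The non-empirical items give the same reply whether or not a colour is supplied, whereas the empirical item moves from ‘can't tell’ to a definite reply once the counter is shown.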
RESULTS


The children's responses to all of the questions are given in the right-hand 
columns of Table 1. The correct answers can be found by reference to the
immediately preceding ‘hidden’ and ‘shown’ columns of the table. It can be seen that the
children had little difficulty in answering both the non-empirical and empirical items
correctly, in both the ‘hidden’ and ‘shown’ conditions. The percentage of children
responding correctly to the non-empirical questions ranged from 71 per cent to 100 per
cent (‘hidden’ condition) and from 76 per cent to 100 per cent in the ‘shown’
condition; to the empirical questions the percentage of correct responses ranged from
76 per cent to 90 per cent (‘hidden’), and from 81 per cent to 100 per cent (‘shown’).
That the similarity in percentage of correct responses from ‘hidden’ to ‘shown’
conditions did not result merely from repeating the first response given is indicated by 
the fact that the children, in order to be correct, had to change their responses to the 
empirical questions. 


Questions 1-4 (Table 1) were asked in the same form (in a ‘hidden’ condition
only) by Osherson and Markman, and the percentages of their subjects answering
correctly were, for questions 1, 2, 3 and 4, respectively, 25 per cent (N = 51), 24 per
cent (N = 25), 27 per cent (N = 51), 56 per cent (N = 25). The percentages for the
present study were, respectively, 71, 71, 90, and 76 per cent (all Ns = 21). χ² tests,
corrected for continuity, were carried out on the observed frequencies, to compare the
children's performance in the present study with that reported by Osherson and
Markman. Three of the comparisons were significant: question 1, χ²(1) = 11.35,
P<0.001; question 2, χ²(1) = 8.52, P<0.01; question 3, χ²(1) = 21.33, P<0.001.
Question 4 failed to reach significance (χ²(1) = 1.25, P>0.05) because of the relatively
high levels of correct performance achieved by the children in both studies.
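
The continuity-corrected comparisons can be checked with a short calculation. What follows is a minimal sketch, assuming that the observed frequencies are those implied by the reported percentages and Ns (for example, 71 per cent of 21 taken as 15 children); these counts are reconstructed here and are not given in the text, and the function name is hypothetical.

```python
def yates_chi_square(a, b, c, d):
    """Chi-square (1 d.f.) with Yates' continuity correction for the 2 x 2 table

        correct  incorrect
          a         b        (present study)
          c         d        (Osherson and Markman)
    """
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    return n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# (question, correct here, N here, correct in comparison study, N there).
# Counts are assumptions derived from the published percentages.
comparisons = [
    (1, 15, 21, 13, 51),
    (2, 15, 21, 6, 25),
    (3, 19, 21, 14, 51),
    (4, 16, 21, 14, 25),
]

for question, k1, n1, k2, n2 in comparisons:
    chi2 = yates_chi_square(k1, n1 - k1, k2, n2 - k2)
    print(f"question {question}: chi-square(1) = {chi2:.2f}")
```

Under these assumed counts the statistics come out close to the values reported above, which suggests that the reconstruction of the frequencies is reasonable.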


EXPERIMENT 2 


Subjects 

Twenty university students, ten male and ten female, aged between 18 and 21 
years were involved. None of the subjects had received formal training in logic, but 
they ought all to be capable of reasonably advanced levels of formal operational
thought.


Procedure 

The materials were the same coloured counters used in Experiment 1. Each 
subject was tested individually for approximately 15 minutes. Before testing began 
the subject was informed that he/she would be asked a number of questions, each 
about a counter held in the experimenter's hand. In each case the question pertained 
to the counter's colour, and was to be preceded by the interrogative ‘true, false, or
can't you tell?’ The total set of items consisted of four non-empirical questions (nos.
1-4, Table 1), and 30 empirical questions (nos. 8-14, Table 1, and 23 others, taken from
Osherson and Markman). In order to minimise the time required for each subject,
and to follow Osherson and Markman’s procedure exactly, two distinct but over- 
lapping sets of items were administered, with two groups of ten subjects allocated to 
each set (for group 1 the set comprised two non-empirical questions, nos. 1 and 3, 
Table 1, and 18 empirical ones; for group 2 the set comprised the four non-empirical 
questions and 22 empirical ones). Order of presentation of the questions was ran- 
domised for each subject, and questions were repeated if the subject hesitated or 
appeared puzzled. 


In Experiment 1 each of the 14 test questions was asked twice, first in the ‘hidden’
condition, with the counter concealed in the experimenter's hand, and again with
the counter shown. An important departure from this procedure in the present
experiment was that each question was asked once only: all four of the non-empirical
questions in the ‘hidden’ condition only, and 26 of the 30 empirical questions in the
‘shown’ condition only. Consideration of the effects this may have had on the results
is given later.


RESULTS 
Empirical questions 
The subjects, in general, had little difficulty in evaluating the empirical questions: 
expressed as a percentage the mean, median, and modal correct responses were 85,
92.5, and 100, respectively. The adults’ high levels of performance are similar to
those achieved by the children of Experiment 1, and those reported by Osherson
and Markman.


Non-empirical questions 

The percentages of correct responses to the four questions (nos. 1-4, Table 1) are 
given below. For comparison purposes Osherson and Markman's results are given 
in parentheses. 


Question 1, 35 per cent, N = 20 (27 per cent, N = 51).
Question 2, 30 per cent, N = 10 (56 per cent, N = 25).
Question 3, 35 per cent, N = 20 (25 per cent, N = 51).
Question 4, 30 per cent, N = 10 (24 per cent, N = 25).


There was very little difference in the standard of performance on the questions 
between our adult subjects and Osherson and Markman's children. Neither of 
them scored highly on the non-empirical questions, and χ² tests carried out on the
raw data for each of the four questions indicated no differences between them (all
Ps > 0.3, two-tailed).

However, comparison of the performance of the adults in the present study with
the children of Experiment 1 revealed differences on all but one of the questions, with
the children performing better on the same questions than the adults (question 1,
χ²(1) = 4.1, P<0.05; question 2, χ²(1) = 3.23, P>0.05; question 3, χ²(1) = 11.3,
P<0.001; question 4, χ²(1) = 4.3, P<0.05; all comparisons two-tailed).


The adults’ failure to reason logically cannot be attributed to their failure to attain 
formal operational thought. On completion of their testing session the experimenter 
pointed out where and why each subject had gone wrong, and they invariably showed 
their increased understanding of the tautologies and contradictions, often by giving 
further examples of their errors. Thus, they were able to demonstrate their logical 
reasoning ability. 


DISCUSSION 


The results of Experiment 1 demonstrate that under appropriate testing conditions 
7-year-old children are able to answer non-empirical questions as easily as their 
empirical counterparts. The failure of Osherson and Markman to obtain this result 
undoubtedly stems from the testing procedure they used, rather than a lack of
competence by the children in this sort of task. To explore further the nature of the task
difficulties, in Experiment 2 adult subjects were tested with similar items, using a 
procedure identical to that reported by Osherson and Markman. 


The results from the two experiments demonstrate both that 7-year-old children 
can, and adults often fail to, reason logically when presented with non-empirical 
questions of the type used here. Two major procedural differences might account for 
the different results found between Experiments 1 and 2. First, the children were 
tested in a relaxed atmosphere after having had some experience both with the testing 
materials and with the type of questions being asked. While the adults were not 
questioned in stressful conditions they were tested without this prior experience. The
second difference lies in the presentation of the questions. For the children half of
the questions were non-empirical ones, and all questions were asked twice, in both 
‘hidden’ and ‘shown’ conditions. For the adults each question was asked once,
there was a much larger number of empirical questions, the majority of which were
asked in the ‘shown’ condition only, while the few non-empirical questions were all
asked in the ‘hidden’ condition. It seems likely that the presentation of a few
‘hidden’ items among a larger number of ‘shown’ items led the subjects in Experiment 2
to an incorrect ‘can't tell’ response, i.e., they were ‘set’ to respond to the
non-empirical questions as if they required empirical verification.


These results have two important implications for formal operations research. 
They show that fluctuations in task demands and task difficulty can profoundly affect 
the manifestation in performance of a subject’s underlying competence. Clearly, as 
others have pointed out (Danner and Day, 1977; Neimark, 1979; Rosenthal, 1979), 
researchers need to be sensitive to the potential effects of such performance variables, 
particularly with respect to the methodology of formal operations assessment. 


The second point is that the results of Experiment 1 demonstrate that 7- and 
8-year-old children can, under appropriate testing conditions, exhibit one of the major 
characteristics of formal operational thinking, namely, the ability to reason in terms of 
verbally stated hypotheses, independently of reliance on concrete objects or empirical 
evidence. This suggests that some features of the later stage are emerging during the 
concrete operations period, and supports other researchers’ suggestions that the age of 
onset of formal, propositional reasoning has been underestimated. 


Requests for reprints should be addressed to Dr. Alan M. Slater, Department of Psychology, 
Washington Singer Laboratories, University of Exeter, Exeter EX4 4QG. 
TEACHING STYLES AND PUPIL PROGRESS:
A RE-ANALYSIS


By M. AITKIN, S. N. BENNETT AND JANE HESKETH*


(Centre for Applied Statistics and Department of Educational Research, 
University of Lancaster) 


Summary. The object of the study was to examine the statistical techniques available 
for the analysis of process-product studies involving non-randomised quasi-experi- 
mental designs, and to demonstrate the practical effects of their use on the data from the 
Teaching Styles study (Bennett, 1976). Of particular concern were the ‘unit of analysis’
or aggregation problem, and the differential effects of treatment grouping by cluster 
and factor methods. 

The original grouping of teachers into formal, informal and mixed styles was 
investigated using a latent class model for the 38 binary questionnaire items. Convincing 
evidence of three overlapping latent classes was found. The comparison of latent classes 
in terms of pre-test gain scores was examined using a series of variance component 
models, allowing for correlation of children within the same class. Differences among 
classes were altered by the probabilistic clustering of the latent class model compared 
to the original findings, and the significance of the differences was reduced when the 
correlation among children was allowed for. 


INTRODUCTION 


IN the four years since the publication of Teaching Styles and Pupil Progress (subse- 
quently abbreviated to TS) there have been rapid developments in the statistical 
methods available for the analysis of complex data. While these developments are 
still in their early stages, it is already clear that they will have an important influence 
on the analysis of large-scale educational research studies. Two of these develop- 
ments are particularly important for the analysis of educational data from surveys and 
observational studies: the development of latent class models for clustering non- 
homogeneous populations, and the development of unbalanced variance component 
(‘mixed’) models for nested and cluster sampling structures.


The objects of this article are to describe the application of these modelling 
procedures to the Teaching Styles data, to report the conclusions drawn, and to 
compare these conclusions with those found in the original analysis. Implications for 
future research studies are also discussed (for statistical detail see Aitkin et al., 1981). 
In the re-analysis, two main questions were considered: 


(1) Is there convincing statistical evidence of distinguishable teaching styles? If
so, how many styles can be convincingly identified, and how can these be characterised?

(2) Is teaching style, as determined statistically above, related to overall pupil
progress?


THE EXISTENCE OF DISTINGUISHABLE TEACHING STYLES 


Cluster analysis 

The use of cluster analysis in educational research is increasing as researchers 
recognise the utility of grouping people rather than grouping variables. Barker 
Lunn’s (1970) study of streaming in the primary school was the first major investi- 
gation to use this approach, delineating two ‘ types’ of teaching closely conforming 
to the progressive-traditional dichotomy. The two most recent studies (Bennett,
1976; Galton et al., 1980) used identical cluster methods to delineate both teacher and
pupil types although the data base was different. The method chosen was based on 
iterative relocation using a Euclidean distance metric. Nevertheless it was recognised 
in both studies that uncertainties about the method itself, for example the most 
appropriate similarity coefficient, should be reflected in cluster interpretation (cf. 
Bennett and Jordan, 1975; Galton et al., 1980, appendix 2c). 


Uncertainty about technique is perhaps best illustrated by the most recent Ameri- 
can study to adopt this approach. Solomon and Kendall (1979) cluster analysed data 
from 50 teachers and 1,200 pupils and in so doing they tried several cluster techniques:
‘Q’ factor analysis, Linear Typal analysis, Cluster build-up, Elementary Linkage
analysis and a hierarchical method. They reported that although most provided six 
teacher clusters they produced somewhat different results. In order to overcome this 
they developed several sets of ‘core clusterings’, each started from the vantage point
of one of the clustering methods. They then identified for each cluster those classes 
which also fell into the same group by at least two of the other clustering methods. 
Discriminant function analysis was then used to complete the cluster assignments. 


While researchers have been struggling with the practical application of clustering 
methods, statisticians have been considering their statistical foundations. Everitt 
(1977), for example, argued that “A fundamental problem in this area is the lack of a
satisfactory definition of exactly what constitutes a cluster. Because of this, most 
clustering techniques cannot be formulated in terms of a satisfactory model... Most 
cluster analysis methods are essentially non-statistical in the sense that they have no 
associated distribution theory or significance tests, and so are unable to relate from 
sample to population...” Hartigan (1977) pointed out the sparsity of methods for 
establishing the ‘reality’ of clustering: “The very large growth in clustering
techniques and applications is not yet supported by development of statistical theory by
which the clustering results may be evaluated ... There are many guesses, conjectures,
analogies, and hopes, and only a few hard results.” Aitkin (1979) has also
pointed out the unsatisfactory nature of clustering methods which are not based on a 
probability model: “How do we know that a particular configuration of clusters
produced by a numerical algorithm would also have been produced by a different 
random sample from the same population, or by a different algorithm on the same 
sample? What confidence can be placed in the existence of real clusters?... The 
only methods of cluster analysis which allow formal statistical tests for the actual 
existence of clusters . . . are those based on mixture models . . .”


Clustering methods based on probability models allow estimation and hypothesis 
testing within the framework of standard statistical theory. Though theoretical 
difficulties remain in deciding on the number of clusters, for a given number of clusters 
the assignment of individuals to clusters is based on standard likelihood ratio methods 
analogous to those used in discriminant analysis. 


Re-analysis 

In re-examining the existence of distinguishable teaching styles, a mixture or latent 
class probability model was adopted using the original 38 binary items from the 
teacher questionnaire. It is assumed that the population consists of k homogeneous 
subpopulations or latent classes of teachers, each class having a distinct teaching style. 
Each class is characterised by a set of 38 response probabilities, the probabilities of 
responding ‘YES’ to each of the 38 binary items in Table 1. Given these probabilities,
the probability that a teacher belongs to the j-th class is calculated from Bayes'
theorem using the pattern of ‘Yes’ and ‘No’ responses for the teacher. Full details
of the model and the method of estimation of the response probabilities are given in 
Aitkin and Bennett (1980). It is an important feature of this form of probabilistic 
clustering that it does not produce assignments of individuals to classes, but gives
instead the probability that each individual belongs to each latent class. This is
preferable to a formal assignment rule (as in discriminant analysis) which assigns each
individual to the class to which he or she has the greatest probability of belonging, 
since this overstates the information available about cluster membership. 
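The Bayes' theorem step described above can be sketched as follows. This is a minimal illustration, not the authors' program: the function name `class_posteriors` and its arguments are hypothetical, only the first three items of the two-class model in Table 1 are used to keep the example short, and the response probabilities are assumed to lie strictly between 0 and 1.

```python
import math


def class_posteriors(responses, theta, lam):
    """Posterior probability of each latent class for one teacher.

    responses: list of 0/1 answers (1 = 'YES'), one per item.
    theta[j][i]: assumed probability of 'YES' on item i in class j.
    lam[j]: assumed proportion of teachers in class j.
    """
    log_weights = []
    for j, item_probs in enumerate(theta):
        # log of (class proportion x likelihood of the response pattern)
        log_w = math.log(lam[j])
        for answer, p in zip(responses, item_probs):
            log_w += math.log(p) if answer else math.log(1.0 - p)
        log_weights.append(log_w)
    # Normalise on the log scale to avoid underflow with many items.
    top = max(log_weights)
    weights = [math.exp(v - top) for v in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


# First three items of the two-class model in Table 1 (entries divided by 100).
theta = [[0.22, 0.60, 0.35],   # Class 1
         [0.43, 0.87, 0.23]]   # Class 2
lam = [0.538, 0.462]
posterior = class_posteriors([1, 1, 0], theta, lam)
print(posterior)
```

With all 38 items the same calculation yields the membership probabilities discussed in the text; no hard assignment is made unless the largest probability is subsequently picked out.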


Parameter estimates 

The parameter estimates (i.e., the maximum likelihood estimates of the response 
probabilities) for the two- and three-latent class models are shown in Table 1. The 
item number corresponds to that in TS (pp. 166-9), the number in parentheses next 
to the item number being the number of this item in Table 2 of Bennett and Jordan 
(1975). For the two-class model, the response probabilities marked † show large


TABLE 1
TWO- AND THREE-LATENT CLASS PARAMETER ESTIMATES (100 × ESTIMATED RESPONSE PROBABILITY) FOR TEACHER DATA

                                                                        Two-Class Model   Three-Class Model
Item                                                                    Class 1  Class 2  Class 1  Class 2  Class 3
1 (1)          Pupils have choice in where to sit                         22       43       20       44       33
2              Pupils sit in groups of 3 or more                          60       87†      54       88       79
3 (2)          Pupils allocated to seating by ability                     35       23       36       22       30
4              Pupils stay in same seats for most of day                  91       63†      91       52       89
5 (3)          Pupils not allowed freedom of movement in classroom        97       54†     100       53       74
6              Pupils not allowed to talk freely                          89       48†      94       50       61
7              Pupils expected to ask permission to leave room            97       76†      96       69       95
8 (4)          Pupils expected to be quiet                                82       42†      92       39       56
9              Monitors appointed for special jobs                        85       67       90       70       69
10 (5)         Pupils taken out of school regularly                       32       60       33       70       35
11             Timetable used for organising work                         90       66†      95       62       77
12             Use own materials rather than text books                   19       49       20       56       26
13             Pupils expected to know tables by heart                    92       76       97       80       75*
14             Pupils asked to find own reference materials               29       37       28       39       34
15 (6)         Pupils given homework regularly                            35       22       45       29       12*
16 (i) (7)     Teacher talks to whole class                               71       44       73       37       62
   (ii) (8)    Pupils work in groups on teacher tasks                     29       42       24       45       38
   (iii) (9)   Pupils work in groups on work of own choice                15       46†      13       59       20
   (iv) (10)   Pupils work individually on teacher tasks                  35       37       57       32       50
   (v) (11)    Pupils work individually on work of own choice             28       50       29       60       26*
17             Explore concepts in number work                            18       55†      14       62       34
18             Encourage fluency in written English even if inaccurate    87       94       87       95       90
19 (12)        Pupils’ work marked or graded                              43       14†      50       16       20
20             Spelling and grammatical errors corrected                  84       68       86       64       78
21 (13)        Stars to pupils who produce best work                      57       29       65       30       34
22 (14)        Arithmetic tests given at least once a week                59       38       68       43       35*
23 (15)        Spelling tests given at least once a week                  73       51       83       56       46*
24             End of term tests given                                    66       44       75       48       42*
25             Many pupils who create discipline problems                 09       09       07       01       18*
26             Verbal reproof sufficient                                  97       95       98       99       91*
27 (i)         Discipline: extra work given                               70       53       69       49       67
   (ii) (16)   Smack                                                      65       42       64       33       63
   (iii)       Withdrawal of privileges                                   86       77       85       74       85
   (iv)        Send to head teacher                                       24       17       21       13       28*
   (v) (17)    Send out of room                                           19       15       15       08       27*
28 (i) (18)    Emphasis on separate subject teaching                      85       50†      87       43       73
   (ii)        Emphasis on aesthetic subject teaching                     55       63       53       61       63*
   (iii) (19)  Emphasis on integrated subject teaching                    22       65†      21       75       33
               Estimated proportion of teachers in each class           0.538    0.462    0.366    0.312    0.322

* Indicates an item on which Class 3 is extreme.
† Indicates an item with large differences in response probability between Classes 1 and 2.
Numbers in parentheses are item numbers in Bennett and Jordan (1975).
Each entry is 100 times the estimated probability that a teacher in the given class responds ‘YES’ to the given item.

differences between the classes, indicating systematic differences in behaviour on these
items for teachers in the two latent classes. For the three-class model, the response 
probabilities for classes 1 and 2 are very close to those for the corresponding classes 
in the two-class model (though in most cases more widely separated), and the response 
probabilities for class 3 are mostly between those for classes 1 and 2, except for those 
items marked with an asterisk. Thus class 3 is to some extent intermediate between 
classes 1 and 2. 


Significance of latent class model 

Before attempting to interpret these results, we need to consider their statistical 
significance. Since this clustering method, like any other, will produce clusters with 
homogeneous random data, convincing evidence of the statistical significance of the 
latent-class clusters is needed. There are two sources for this evidence. 


First, the probabilities of cluster membership for all 468 teachers, for the two-class
model, are considered. Figure 1 shows a histogram of the 468 probabilities
of cluster membership (arranged so that the larger of the two probabilities is shown).
Of the 468 teachers, 257, or 55 per cent, had a probability of 0.99 or more of belonging
to one of the classes, and a further 83, or 18 per cent, had a probability of 0.95 to 0.989.
Thus 73 per cent of teachers had a probability of 0.95 or more of belonging to one of
the classes.

For comparison, 19 random samples of 468 from homogeneous populations were
generated in which the 38 items were independent, and had the same marginal response
rates as in the TS data. In these samples, on average 35 per cent of individuals had a
probability of 0.95 or more of belonging to one of the two classes when a two-class
model was fitted to the homogeneous data. Thus with homogeneous data about one
third of individuals are assigned confidently to one of the two classes. This gives an
indication of the apparent degree of clustering to be expected from random data.
The degree of clustering in the TS data is very much greater than this, pointing to the
real existence of distinct teaching styles. For the three-class model, there was a
substantial drop in the proportion of teachers with high probabilities of membership
in one of the classes. Only 30 per cent had probabilities of 0.99 or more, and another
19 per cent had probabilities between 0.95 and 0.989.
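The comparison with homogeneous data can be imitated with a short simulation. The sketch below is only an illustration under stated assumptions: the marginal response rates are hypothetical placeholders (the TS marginals are not reproduced here), the EM routine `fit_two_class` is a generic two-class latent class fit rather than the authors' program, and the number of iterations is kept small for speed.

```python
import math
import random


def simulate_homogeneous(n, marginals, rng):
    """n response patterns with independent items and fixed marginal rates."""
    return [[1 if rng.random() < p else 0 for p in marginals] for _ in range(n)]


def fit_two_class(data, n_iter=30, seed=1):
    """Generic EM fit of a two-class latent class model to binary data."""
    rng = random.Random(seed)
    m = len(data[0])
    lam = [0.5, 0.5]
    theta = [[rng.uniform(0.3, 0.7) for _ in range(m)] for _ in range(2)]
    post = []
    for _ in range(n_iter):
        # E-step: posterior class probabilities for every individual.
        post = []
        for row in data:
            logw = []
            for j in range(2):
                lw = math.log(lam[j])
                for y, p in zip(row, theta[j]):
                    lw += math.log(p if y else 1.0 - p)
                logw.append(lw)
            top = max(logw)
            w = [math.exp(v - top) for v in logw]
            total = sum(w)
            post.append([v / total for v in w])
        # M-step: update class proportions and response probabilities.
        for j in range(2):
            weight = sum(p[j] for p in post)
            lam[j] = min(max(weight / len(data), 1e-6), 1.0 - 1e-6)
            for i in range(m):
                num = sum(p[j] * row[i] for p, row in zip(post, data))
                theta[j][i] = min(max(num / max(weight, 1e-12), 1e-6), 1.0 - 1e-6)
    return lam, theta, post  # post comes from the final E-step


def confident_share(post, cut=0.95):
    """Proportion of individuals whose largest class probability is >= cut."""
    return sum(1 for p in post if max(p) >= cut) / len(post)


rng = random.Random(0)
marginals = [0.2 + 0.6 * i / 37 for i in range(38)]  # hypothetical rates
data = simulate_homogeneous(468, marginals, rng)
lam, theta, post = fit_two_class(data)
share = confident_share(post)
print(round(share, 2))
```

Repeating the last four lines with different seeds gives a reference distribution for the share of confidently assigned individuals under homogeneity, against which the observed 73 per cent can be set.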

The second source of evidence is a formal test for significance. The test used was
based on empirical simulations of the distribution of the test statistic ($-2 \log l$,
where $l$ is the likelihood ratio) under the null hypothesis of a single homogeneous
population. In 19 simulations of the value of $-2 \log l$, the largest value obtained was
84.4. In the TS data, the value of $-2 \log l$ was 775.8, very much larger than the
‘critical value’ above. The null hypothesis of a single homogeneous population can
thus be rejected at the 5 per cent level in favour of the alternative hypothesis of two
latent classes. The statistical test used is a permutation test: if the observed test
statistic is larger than s simulation values of the test statistic under the null hypothesis,
then the null hypothesis can be rejected at level 1/(s + 1); here s = 19, giving 1/20 = 0.05.
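The rule for the simulation-based test can be written out in a few lines. In this sketch the function name `monte_carlo_p_value` is hypothetical; the observed value 775.8 and the largest simulated value 84.4 are taken from the text, while the other 18 simulated values are placeholders, since they are not reported.

```python
def monte_carlo_p_value(observed, simulated):
    """Level attained by the observed statistic: (r + 1) / (s + 1),

    where s is the number of simulated values and r is the number of
    simulated values at least as large as the observed one.
    """
    s = len(simulated)
    r = sum(1 for value in simulated if value >= observed)
    return (r + 1) / (s + 1)


# 84.4 is the reported maximum; the remaining values are placeholders only.
simulated_stats = [84.4] + [80.0 - i for i in range(18)]
p_value = monte_carlo_p_value(775.8, simulated_stats)
print(p_value)
```

Because the observed statistic exceeds all 19 simulated values, r = 0 and the attained level is 1/(19 + 1), whatever the 18 unreported values actually were.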


The same test was used to assess the likelihood of three classes and this was also 
significant. Those findings, together with other data provided by Aitkin and Bennett 
(1980), indicate that three overlapping, rather than two distinct, styles exist. These 
are described below. 


Interpretation of classes 

Almost all Class 1 teachers restrict the movement and talk of children in the class- 
room, whilst a large majority organise their work by timetable, emphasise separate 
subject teaching and talk to the whole class. A majority have pupils working
individually on teacher tasks. Class 2 teachers are much less restrictive in their classroom
organisation, emphasise integrated subject teaching, and are likely to have pupils
working individually or in groups on work of their own choice. Marking or grading
of pupils’ work is very uncommon in Class 2. The identification of Class 1 with a
formal, and Class 2 with an informal, teaching style (as these terms were used in 
TS) is very clear. 

Class 3 shares some of the characteristics of both the other classes. Like the 
formal teachers, their pupils stayed in the same seats for most of the day, were expected 
to ask permission to leave the room and were not taken out of school regularly, and 
the teachers used textbooks rather than their own materials, had similar teacher 
emphasis (Item 16) and similar disciplinary actions to the formal teachers. However, 
like the informal teachers, their pupils tended to sit in groups of three or more, they 
did not often mark or grade work, and did not give stars for good work. Amongst the 
three classes they placed greatest emphasis on aesthetic subject teaching. It is notable 
that the Class 3 teachers were lowest in expecting pupils to know their tables by heart, 
in giving homework regularly, in giving weekly arithmetic or spelling tests, and end of 
term tests. Eighteen per cent of these teachers had many pupils who created discipline 
problems, compared with only seven per cent of formal teachers and one per cent of 
informal teachers, and nine per cent found a verbal reproof insufficient, compared
with two per cent of formal, and one per cent of informal teachers. Sending children 
out of the room, or to the head teacher, were more common disciplinary measures for 
Class 3 than for either of the other two classes. 


While Class 3 shares some of the characteristics of each of the other two classes, 
and might therefore reasonably be called ‘mixed’, the disciplinary problems and the
low frequency of testing and assessment give this class a somewhat different character 
from that of the mixed style in TS. 


Comparison with TS Clusters 

Since the result of probabilistic clustering is not an assignment to clusters but a 
set of probabilities of class membership, it is not easy to present a simple table com- 
paring the classes of teaching style for each clustering method. Two tables are 
presented. First, in Table 2 each teacher is formally assigned to the latent class to 
which he has the highest probability of belonging, and this assignment is compared 
with his membership in one of the 12 TS clusters. It should be noted that 78 teachers
were not assigned to any of the 12 TS clusters, as they were not close to any of the 
12 cluster centroids. These teachers form the ‘ unclassified’ group in Table 2. 


It is clear from Table 2 that only TS Clusters 1 and 12 correspond closely to the 


TABLE 2
LATENT CLASS ASSIGNMENT AND TS CLUSTER MEMBERSHIP FOR 468 TEACHERS

                                                     TS Cluster
Latent class            1     2     3     4     5     6     7     8     9    10    11    12   Unclass   Total
‘Formal’ (Class 1)      —     —     5     9     4     2     9    20    19    24    31    36      16      175
                       (0)   (0)  (21)  (27)  (15)   (5)  (30)  (67)  (53)  (77)  (79) (100)    (21)
‘Mixed’ (Class 3)       1    13    11     2     7    26    19     6    14     6     8     —      31      144
                       (3)  (41)  (46)   (6)  (27)  (69)  (63)  (20)  (39)  (20)  (21)   (0)    (39)
‘Informal’ (Class 2)   34    19     8    22    15    10     2     4     3     1     —     —      31      149
                      (97)  (59)  (33)  (67)  (58)  (26)   (7)  (13)   (8)   (3)   (0)   (0)    (40)
Total                  35    32    24    33    26    38    30    30    36    31    39    36      78      468

The top entry is the number of teachers in each latent class who fall in the corresponding TS cluster, and the bottom
entry is the percentage of teachers out of the total in this cluster.

latent classes (2 and 1 respectively). About 40 per cent of TS Cluster 2 teachers are in
latent class 3, the ‘mixed’ class, as are 20 per cent of TS Cluster 11 teachers. The
remaining TS clusters are split across all three classes to varying degrees, the proportion
of Class 1 teachers increasing, and of Class 2 teachers decreasing, fairly steadily
from Cluster 1 to Cluster 12. Clusters 6 and 7 contain the greatest proportion of 
Class 3 teachers. 


The general pattern of Table 2 supports the ordering in TS from Cluster 1 to 
Cluster 12 of increasing formality, though as noted there (p. 47), clusters other 
than 1 and 12 “ contain both formal and informal elements." 


It was noted above that the formal assignment of teachers to latent classes over- 
states the information available from the probabilistic clustering. Since the con- 
clusions drawn about pupil progress in TS depend critically on the cluster membership 
of the 37 teachers, Table 3 considers the actual latent class membership for these 
teachers. Table 3 shows the probabilities of latent class membership for 36 of the 
teachers (one mixed TS style teacher could not be identified, and has been omitted 
from this table) for the three-class model. 


TABLE 3
LATENT CLASS PROBABILITY AND TS STYLE CATEGORY FOR 36 TEACHERS: THREE CLASSES

                                    TS Style
                    Formal            Mixed           Informal
Latent class      1    3    2      1    3    2      1    3    2
                100    —    —    100    —    —      —    —  100
                100    —    —    100    —    —      —    —  100
                 99   01    —     70   30    —     01   85   14
                 99   01    —     12   88    —      —    —  100
                100    —    —     44   49   07      —   03   97
                100    —    —     01   98   01      —    —  100
                 92   08    —      —   14   86      —   03   97
                100    —    —    100    —    —      —    —  100
                 98   02    —     85   15    —      —    —  100
                100    —    —     11   89    —      —    —  100
                 71   29    —      —   01   99      —   73   27
                 94   06    —                       —   36   64
                                                    —   93   07

The entries are the probabilities of latent class membership (× 100) for the
three-class model for 36 of the 37 teachers in TS Chapter 5.


The formal TS teachers, with one exception, have very high probabilities of
belonging to Class 1. The one exception has a probability of 0.29 of belonging to
Class 3, the ‘mixed’ class. Nine of the 13 informal TS teachers have very high
probabilities of belonging to Class 2, but three of the remaining four have high probabilities
of belonging to Class 3, and the fourth is essentially unidentified. The mixed TS
teachers are poorly identified: three clearly belong to Class 1, one to Class 2, and one
to Class 3, while the remainder have substantial probabilities of belonging to two
classes.


Conclusion 

There is convincing statistical evidence, based on the latent class model, of three 
distinguishable but overlapping teaching styles. Two of these correspond closely to 
the broad classes ‘formal’ and ‘informal’ as these terms were used in TS. The third
class, called ‘mixed’ here as in TS, is characterised by a low frequency of testing and
assessment, and a relatively high frequency of disciplinary problems. The classification
of the 36 teachers used in TS corresponds closely to the class membership
probabilities for formal teachers, less so for informal teachers and poorly for mixed 
teachers. 


THE RELATION OF TEACHING STYLE TO PUPIL PROGRESS 


In Chapter 5 of TS the relation between teaching style and pupil progress was 
investigated using an analysis of covariance model. The analysis was based on the 
individual pre-test and test scores of each child, the children being classified by the 
teaching style (formal, mixed, informal) of the teacher. 


There has been considerable discussion in the educational research literature of
the “unit of analysis” question: should the child or the classroom be treated as the
‘unit’ on which statistical analysis is based? Gray and Satterly (1976) raised this 
question in their discussion of TS, and Bennett and Entwistle (1976) referred to it 
briefly in their reply. Satterly and Gray (1976) gave a more detailed discussion of some 
of the statistical issues involved, and recognised the need for a variance component 
model for the data. A ‘mixed’ or variance component model for ‘clustered’ or
‘nested’ sample designs is developed below for the one-way analysis of covariance
for pre-test/test situations. This model is then applied to the latent class membership 
assignment for the 36 teachers described in Section 1. The model is then adapted for 
the probability of latent class membership of the teacher. 


Variance component model for the analysis of covariance 

Let $Y_{pqr}$ denote the achievement test score, and $x_{pqr}$ the pre-test score, of the
r-th child in the q-th classroom, taught by method p, where $r = 1, \ldots, n_q$,
$q = 1, \ldots, 36$, $p = 1, 2, 3$, and $N = \sum_q n_q$. All subsequent analyses will be based on
extensions or contractions of the variance component model:


$$Y_{pqr} = \mu + \gamma x_{pqr} + \alpha_p + T_q + E_{pqr}.$$


Here $T_q$ and $E_{pqr}$ are mutually independent random variables, assumed to be normally
distributed:


$$T_q \sim N(0, \sigma_T^2), \qquad E_{pqr} \sim N(0, \sigma_E^2),$$


and $\mu$ and $\gamma$ are the intercept and slope of the regression.

The $\alpha_p$ are constants with $\alpha_3 = 0$ (so that the model is of full rank; alternatively,
we could take $\sum_p \alpha_p = 0$), representing the mean achievement differences between
methods 1 and 2, and method 3. The slope of the regression of test score Y on pre-test
score x is $\gamma$, assumed to be the same within each teaching method.


The T, are treated as random variables rather than fixed constants because the 
teachers have been selected from a population, and the interest is in modelling 
the variability of teachers in this population, rather than drawing inferences about the 
particular subset of teachers included in the sample. Teaching methods are repre- 
sented by fixed constants because they are the unique set of experimental ‘ treatments ' 
under examination. 


The properties of the above model are well known, and are described, for example, 
in Searle (1971, Chapters 9 and 10). A consequence of the random teacher effects is
that the achievement scores of children within the same classroom are positively 
correlated: for two children $r \neq r'$ in the same classroom,

$$\mathrm{var}(Y_{pqr}) = \mathrm{var}(T_q + E_{pqr}) = \sigma_T^2 + \sigma_E^2,$$
$$\mathrm{cov}(Y_{pqr}, Y_{pqr'}) = \mathrm{cov}(T_q + E_{pqr},\, T_q + E_{pqr'}) = \mathrm{var}(T_q) = \sigma_T^2,$$
$$\mathrm{corr}(Y_{pqr}, Y_{pqr'}) = \rho = \sigma_T^2 / (\sigma_T^2 + \sigma_E^2).$$

The intraclass correlation $\rho$ may be large if $\sigma_T^2$ is large compared with $\sigma_E^2$, and is
zero only when $\sigma_T^2 = 0$, that is when there is no variation among teachers in the
teacher population, which will rarely happen in practice.
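The intraclass correlation, the teacher variance divided by the sum of the teacher and child-level variances, can be checked by simulation. The following sketch is illustrative only: the function name is hypothetical, and the variance values 61 and 185 are chosen to resemble the reading pre-test estimates reported in Table 4, not taken from any fitted ANCOVA model.

```python
import random


def simulated_pair_correlation(sigma2_t, sigma2_e, n_classes=20000, seed=0):
    """Correlation between two children sharing a teacher effect.

    Each simulated classroom contributes one pair of scores T + E1 and
    T + E2, with T ~ N(0, sigma2_t) and E1, E2 ~ N(0, sigma2_e).
    """
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(n_classes):
        t = rng.gauss(0.0, sigma2_t ** 0.5)
        first.append(t + rng.gauss(0.0, sigma2_e ** 0.5))
        second.append(t + rng.gauss(0.0, sigma2_e ** 0.5))
    mean_a = sum(first) / n_classes
    mean_b = sum(second) / n_classes
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    return cov / (var_a * var_b) ** 0.5


theoretical_rho = 61 / (61 + 185)
simulated_rho = simulated_pair_correlation(61, 185)
print(round(theoretical_rho, 3), round(simulated_rho, 3))
```

The two printed values should agree closely, which is the sense in which ignoring the teacher effect treats correlated observations as if they were independent.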


The above model may be extended to allow for pre-test by method interactions: 
it may happen that the slope of the regression of test score y on pre-test score x is 
different for different methods. A comparison of the methods then depends on the 
covariate value considered, and one method may be superior for low pre-test scores, 
while another is superior for high pre-test scores. The extended model is


$$Y_{pqr} = \mu + \gamma_p x_{pqr} + \alpha_p + T_q + E_{pqr},$$


and the regressions are now $\mu + \gamma_1 x_{1qr} + \alpha_1$ for method 1, $\mu + \gamma_2 x_{2qr} + \alpha_2$ for method 2,
and $\mu + \gamma_3 x_{3qr}$ for method 3.


Unconditional conclusions about the relative superiority of one treatment to 
another are not possible in general with this extended model. Although methods are 
available for drawing conditional conclusions given the value of the pre-test score, this 
is not pursued further, as the interaction model will not be found necessary. 


In general, efficient (maximum likelihood) estimation of the parameters in the 
above models requires extensive iterative computation, even when the class sizes are 
equal. Several simpler methods are available which give consistent, but not efficient, 
estimates, and for which approximate ANOVA tables can be constructed. Three of 
these were applied to the TS data, both for internal comparisons, and for comparison 
with the efficient method. Discussion of these methods is given in Aitkin and Bennett
(1980). The methods are summarised in terms of their estimation of the fixed effects
as follows: I, ignore the random effects; II, unweighted class means; III, class
means weighted by sample size.


For each method, an ANOVA table can be presented as follows. 


Source                                                              SS       df       MS

Regression on pre-test x                                            SS_1      1
Among methods, adjusted for pre-test                                SS_2      2       MS_2
Method × pre-test interaction, adjusted for methods and pre-test    SS_3      2       MS_3
Residual variation among teachers                                   SS_4     31       MS_4
Within teachers, adjusted for pre-test                              SS_5     N − 37   MS_5


The first four sums of squares are obtained by successive differencing of the residual 
sum of squares among teachers after fitting the appropriate parameters. This
procedure is fully described in Aitkin and Bennett (1980). 


It should be noted that the sums of squares do not have distributions which are
multiples of $\chi^2$, even when the class sizes are equal. If there are no covariates, the
sums of squares have multiples of $\chi^2$ distributions if the class sizes are equal, and
approximate $\chi^2$ distributions if the class sizes are not too unequal.


Before describing the results of these methods, consideration of the effect of 
classroom formation on the conclusions to be drawn is needed. 


The effect of non-random assignment to classes 

The straightforward interpretation of teaching method differences would require 
the random assignment of children to classrooms, and the random assignment of 
teachers to teaching methods. The reality of the classroom formation in TS is very 
different. First, teachers were not randomly assigned to methods: rather, teachers 
with existing styles were assigned (independently of the TS study) to intact classes.
The greatest extent of randomness that could be hoped for is that the assignment of 
teachers was not based on the nature of pupils in the classes, that is, that teachers
recognised as ‘formal’ were not systematically assigned to classes which were below
(or above) average on the pre-test. 


If there were evidence of such an assignment bias, it would be very difficult to 
draw general conclusions about differences in achievement between formal and infor- 
mal teaching styles used on pupils of the same initial achievement, for teaching style 
and initial achievement would be at least partly confounded. 


Since pupils were not randomly assigned to classes, it may be expected that the 
36 classes will differ systematically in their mean scores on the pre-test, such differences 
reflecting variation in the school populations, previous teachers and other systematic 
effects. The adjustment for the pre-test should then reduce the residual variation 
among teachers, and thus increase the sensitivity of the test for teaching style differ- 
ences, since the variation among teaching styles would not be reduced by the pre-test 
adjustment, if initial achievement and teaching style are not confounded. 


Thus we may expect that the ANOVA variance component model, when applied 
to the TS study, will give interpretable results only if there are no systematic differences
among teaching styles on the pre-test score. Even in this case, considerable care is 
needed in interpreting different styles as a cause of differential achievement. The data 
do not come from a randomised experiment, and there are many possible confounding 
variables. Discussions of such variables were given in TS, Gray and Satterly (1976) 
and Bennett and Entwistle (1976). 


With these cautions in mind, the results of the variance component models 
applied to the TS data are considered in the next section.


A further difficulty, referred to several times previously, is that latent class
membership is probabilistic, since class membership is not observable. An extended 
ANCOVA model incorporating latent variables is necessary to properly model the full 
data: such a model is considered later. 


ANCOVA results for the TS data 

The pre-test scores for reading, mathematics and English are first considered.
A one-way classification variance component model is fitted to each of the pre-test
scores, using the approximate ANOVA method for unequal class sizes. The ANOVA 
tables are shown in Table 4, based on complete data for 921 children (although 950 
children were analysed in TS, one complete classroom of 29 children was omitted in 
the re-analysis because the teacher's style could not be identified). 

















TABLE 4
ANOVA OF PRE-TEST SCORES

                                             Reading            Mathematics            English
Source                             df       SS       MS         SS       MS         SS       MS
Among styles                        2      2,826    1,413      1,203      602      2,853    1,427
Among classrooms within styles     33     57,649    1,747     50,355    1,526     55,224    1,683
Within teachers                   885    163,540      185    106,227      120    139,293      157

Variance component estimates
$\hat{\sigma}_E^2$                                    185                   120                   157
$\hat{\sigma}_T^2$                                     61                    53                    59
$\hat{\rho}$                                         0.25                  0.31                  0.27

Means
Formal                                              101.1                  99.9                 102.8
Mixed                                                97.4                  97.3                  99.7
Informal                                             97.7                  97.9                  99.1

In all three cases, the among-styles mean square is less than the among-teacher
within-styles mean square, so there is no evidence of association of style with pre-test
score. The variance component estimates are also given in Table 4, based on the 
pooled style and within-style sums of squares. The correlation between children's 
pre-test scores within classrooms is moderate, and certainly not zero. 
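The variance component estimates can be approximately reproduced from the sums of squares in Table 4. The sketch below uses the standard one-way ANOVA estimator and is hedged accordingly: the function name is hypothetical, and the effective class size n0 is assumed to be the average class size 921/36, since the exact class sizes are not given here, so the results agree with the table only to within rounding.

```python
def variance_components(ss_styles, ss_classrooms, ms_within,
                        df_between=35, n0=921 / 36):
    """One-way ANOVA estimates of the teacher variance and intraclass correlation.

    The style and within-style sums of squares are pooled (2 + 33 = 35 df),
    and n0 is an ASSUMED effective class size (average of 921 children
    over 36 classes).
    """
    ms_between = (ss_styles + ss_classrooms) / df_between
    sigma2_t = (ms_between - ms_within) / n0
    rho = sigma2_t / (sigma2_t + ms_within)
    return sigma2_t, rho


# (SS among styles, SS among classrooms within styles, MS within), Table 4.
table4_ss = {
    "Reading": (2826, 57649, 185),
    "Mathematics": (1203, 50355, 120),
    "English": (2853, 55224, 157),
}
estimates = {name: variance_components(*row) for name, row in table4_ss.items()}
for name, (sigma2_t, rho) in estimates.items():
    print(name, round(sigma2_t, 1), round(rho, 2))
```

The teacher variance components come out close to the tabulated 61, 53 and 59, and the intraclass correlations round to the tabulated 0.25, 0.31 and 0.27.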


This conclusion differs from that in TS and arises from the use of a different
denominator in the F-tests used. In TS, the within-styles mean square was used as the
denominator for the F-test of among-style differences. Here the residual among
classrooms within-styles mean square is used. In the variance component model the
ratio of among-styles mean square to within-styles mean square does not have an
F distribution unless $\sigma_T^2 = 0$, and its distribution depends on the ratio of the variance
components $\sigma_T^2/\sigma_E^2$ (see Aitkin and Bennett, 1980, for details). Since for all three
test scores $\sigma_T^2 = 0$ is not tenable, the test for style differences must be based on the
ratio of among-styles mean square to the among-classrooms within-styles mean square.
The former ratios are all about 10, the latter are all less than 1.0.
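The effect of the choice of denominator can be seen by computing both ratios directly from the mean squares in Table 4. This is simple arithmetic on the tabulated values; the variable names are illustrative.

```python
# (MS among styles, MS among classrooms within styles, MS within teachers)
table4_ms = {
    "Reading": (1413, 1747, 185),
    "Mathematics": (602, 1526, 120),
    "English": (1427, 1683, 157),
}

ratios = {}
for test, (ms_styles, ms_classrooms, ms_within) in table4_ms.items():
    ratios[test] = {
        # Denominator used in TS: children treated as independent units.
        "styles / within": ms_styles / ms_within,
        # Denominator required by the variance component model.
        "styles / classrooms": ms_styles / ms_classrooms,
    }

for test, values in ratios.items():
    print(test, {name: round(value, 2) for name, value in values.items()})
```

With the within-teachers mean square as denominator every ratio exceeds 5, whereas with the among-classrooms mean square every ratio is below 1, which is why the apparent style differences on the pre-tests disappear.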


The ANCOVA variance component model was fitted to each of the test scores, and 
the ANOVA tables are shown in Table 5. Three tables are presented comparing the 
three methods on each test score. The analyses of variance and parameter estimates in 
Table 6 are fairly consistent over the three methods. 

It is clear that the variation among styles is quite small compared with that 
among teachers within styles. There are negligible style-by-pre-test interactions, 
though it is notable that Methods II and III consistently find larger interactions than 
Method I. These interactions are therefore pooled with the error term as indicated in 
Table 5. As expected, the residual variation among teachers on the test score has been 
substantially reduced compared with that on the pre-test after adjustment for the 
pre-test. However, the teaching style sums of squares are such that the small effects of 


TABLE 5
ANCOVA OF TEST SCORES: LATENT CLASS ASSIGNMENT

Method I Method II Method III
Source df SS MS F SS MS F SS MS F

(a) Reading
Pre-test 1 132,985 78 38,590
Among styles 2 527 263 0.39 17.5 8.8 0.34 530 265 0.38
P × S interaction 2 774 387 99.4 49.7 2,160 1,080
Residual 33 25.8 698
among teachers 31 21,644 698 726.8 24 20,180 673
Within teachers 882 38,305 43

(b) Mathematics
Pre-test 1 144,555 69,450
Styles 2 972 486 0.99 103 0.63 670 335 0.80
P × S 2 356 178 21.8 870 435
Residual 33 492 164 418
among teachers 31 15,892 513 4797 160 12,500 417
Within teachers 882 44,376 50

(c) English
Pre-test 1 116,457 41,550
Styles 2 1,186 593 2.02 16.3 131 982 491 1.67
P × S 2 26 13 13.7 469 235
Residual 33 12.4 294
among teachers 31 9,675 312 123 8,939 298
Within teachers 882 41,285 47

TABLE 6
ADJUSTED MEAN DIFFERENCES FOR TEACHING STYLES: LATENT CLASS ASSIGNMENT

                               Reading                  Mathematics                 English
Method                     I      II     III         I      II     III         I      II     III
Formal                    0.10  −0.35   0.03        1.09   0.61   0.79        1.49   1.24   1.34
Mixed                    −1.12  −0.6   −1.08       −1.62  −1.28  −1.41       −1.41  −1.25  −1.35
Informal                  1.91   1.02   1.04        0.52   0.66   0.61       −0.08   0.00   0.01

Regression coefficient    0.77   0.85   0.80        0.95   1.18   1.15        0.76   0.87   0.82

(These estimates are obtained from $\hat{\alpha}_1$ and $\hat{\alpha}_2$ in the ANCOVA model of §2.3, $\alpha_3$ being set to zero, by
subtracting $(\hat{\alpha}_1 + \hat{\alpha}_2)/3$, to give estimates which sum to zero.)


different teaching styles are swamped by the variation among classrooms due to other 
systematic effects. The largest style effects are in English using Method I, but the 
F-value for among-styles compared with among-teachers is only 2.02, which is not
significant. Table 6 shows the intercept differences, or adjusted mean differences, 
between styles on each test, from the model with no interactions. 
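
The pooling of the style-by-pre-test interaction with the among-teachers term, and the variance ratio quoted above, can be checked by hand. The following is only an illustrative sketch: it uses the English, Method I entries as read from Table 5 (P x S sum of squares 26 on 2 df, among-teachers sum of squares 9,675 on 31 df, styles mean square 593), and the function name is ours rather than anything used in the original analysis.

```python
def pooled_f_ratio(ms_effect, ss_interaction, df_interaction, ss_teachers, df_teachers):
    """Pool the interaction with the among-teachers term and form the F ratio.

    Returns (pooled residual mean square, F ratio for the effect).
    """
    pooled_ss = ss_interaction + ss_teachers
    pooled_df = df_interaction + df_teachers
    pooled_ms = pooled_ss / pooled_df
    return pooled_ms, ms_effect / pooled_ms


# English, Method I, values as read from Table 5.
pooled_ms, f_value = pooled_f_ratio(
    ms_effect=593, ss_interaction=26, df_interaction=2, ss_teachers=9675, df_teachers=31
)
print(round(pooled_ms), round(f_value, 2))  # 294 2.02
```

The pooled residual has 2 + 31 = 33 degrees of freedom, which is why the styles effect is referred to an F distribution on 2 and 33 df.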


The direction of the differences is not consistent with those reported in TS, due
to the change in class membership of teachers resulting from the different class
assignment by the latent class model. The formal classrooms do best in English and slightly
better in mathematics but the informal classes do best in reading. The mixed
classrooms do worst on all tests. These differences, though interesting, are not statistically
significant.


Latent class model for change 

Teaching style is not observable, but is estimated probabilistically from the 38 
binary behaviour variables. It has been shown that the mixed style, Class 3, was the 
lowest on all three tests, and Table 3 shows that there are very few teachers who are 
unequivocally assigned to this class: only two teachers have a probability of 0.9 or
more of belonging to it, and three others a probability of 0.85 or more, though there
are seven teachers assigned to this style by the assignment rule. 


The assessment of teaching style differences must allow for the certainty with
which these style assignments are made. A reasonable procedure is to fit the
ANCOVA model replacing the implicit (0, 1) dummy variables for teaching style
membership by the latent class membership probabilities. Thus if $z_p$ takes the value 1 if
the child is taught by a teacher in latent class $p$, and 0 otherwise, then $z_1$ is replaced
by $P(\text{class } 1 \mid X = x)$, and $z_2$ by $P(\text{class } 2 \mid X = x)$, where these probabilities are given
in Table 3. Thus for the first teacher in the table, the dummy variables $z_1$ and $z_2$ take
the values 1.00 and 0.00, as they do in the previous analysis. For the last teacher,
$z_1$ and $z_2$ take the values 0.00 and 0.07, these being the probabilities of membership in
Classes 1 and 2 for this teacher.
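
The substitution can be illustrated with a small sketch. The two teachers are the first and last of Table 3 as described in the text; the third probability for each is simply what remains after Classes 1 and 2, and the "most probable class" assignment rule is our assumption about how the formal (0, 1) assignment was made. The function names are hypothetical.

```python
def hard_dummies(probs):
    """(z1, z2) when the teacher is formally assigned to the most probable class.

    Assumes assignment to the class with the largest membership probability.
    """
    assigned = max(range(len(probs)), key=lambda k: probs[k])
    return (1.0 if assigned == 0 else 0.0, 1.0 if assigned == 1 else 0.0)


def probability_dummies(probs):
    """(z1, z2) when the dummies are replaced by the membership probabilities."""
    return (probs[0], probs[1])


# Membership probabilities for Classes 1, 2 and 3 (Class 3 by subtraction).
teachers = {
    "first teacher": (1.00, 0.00, 0.00),
    "last teacher": (0.00, 0.07, 0.93),
}

for name, probs in teachers.items():
    print(name, hard_dummies(probs), probability_dummies(probs))
```

For the first teacher the two codings coincide; for the last teacher the formal assignment gives (0, 0) whereas the probabilistic coding gives (0.00, 0.07), so the uncertainty of the assignment is carried into the regression.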


In the resulting ANCOVA model, $\alpha_1$ and $\alpha_2$ still have the same interpretations as
the (Class 1 - Class 3) and (Class 2 - Class 3) mean differences; the change is only in the
certainty of the identification of the class membership of each teacher. 
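
The tabulated style effects are these differences re-expressed to sum to zero. A minimal sketch of that arithmetic follows. The input values 1.44 and 2.43 are not reported anywhere in the paper: they are backed out from the reading column of Table 11 (formal 0.15, mixed -1.29, informal 1.14) on the assumption that the mixed style is the reference Class 3, and treating formal as Class 1 and informal as Class 2 is likewise an assumption.

```python
def centred_style_effects(alpha1, alpha2):
    """Re-express (alpha1, alpha2, alpha3 = 0) as three effects summing to zero.

    The common shift is (alpha1 + alpha2) / 3, since alpha3 is fixed at zero.
    """
    shift = (alpha1 + alpha2) / 3.0
    return (alpha1 - shift, alpha2 - shift, 0.0 - shift)


# Illustrative differences from the reference class (assumed values, see above).
formal, informal, mixed = centred_style_effects(1.44, 2.43)
print(round(formal, 2), round(informal, 2), round(mixed, 2))  # 0.15 1.14 -1.29
```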


The use of the probabilities of style membership instead of the (0, 1) dummy 
variables, though reasonable, is only an approximation to the efficient maximum 
likelihood analysis. It is analogous to the use of estimated factor scores as predictors 
of a response variable, instead of the full maximum likelihood estimation of the 
parameters in the combined factor and regression model (such models are discussed
in Jöreskog and Goldberger (1975) and can be analysed using the LISREL package).


TABLE 7
ANCOVA OF TEST SCORES: LATENT CLASS PROBABILITIES

                             Method I                 Method II                Method III
Source               df      SS      MS      F        SS      MS      F        SS      MS      F

(a) Reading
Pre-test              1  132,985                    1,778                   38,590
Styles                2      693     346   0.51      18.5     9.2   0.39       660     330   0.50
P x S                 2      629     314            115.9    57.5   2.43†    2,470   1,235   1.88†
Residual             33              674
  among teachers     31   21,623     698            710.2    23.7           19,740     658
Within teachers     882   38,305      43

† interaction significant for informal group

(b) Mathematics
Pre-test              1  144,555                    2,975                   69,450
Styles                2    1,731     865   1.84      36.0    18.0   1.23     1,170     585   1.46
P x S                 2      298     149             52.4    26.2   1.78     1,040     520
Residual             33              469                                               402
  among teachers     31   15,191     490            445.4    14.7           11,830     394
Within teachers     882

(c) English
Pre-test              1  116,457                    1,841                   41,550
Styles                2    1,858     929   3.39**    44.1    22.0   1.93     1,554     777   2.82*
P x S                 2       17       8             32.5    16.2   1.42       424     212
Residual             33              274                                               276
  among teachers     31    9,013     291            353.9    11.4            8,359     279
Within teachers     882

* P < 0.10   ** P < 0.05





TABLE 8
ADJUSTED MEAN DIFFERENCES FOR TEACHING STYLES: LATENT CLASS PROBABILITIES

                     Reading                   Mathematics               English
Method            I      II and III          I      II     III      I      II     III
Formal          0.57   (see text for      1.65   0.91   1.20    2.09   1.62   1.93
Mixed          -1.75    these:           -2.78  -2.08  -2.36   -2.44  -1.92  -2.33
Informal        1.19    interaction)      1.43   1.16   1.15    0.35   0.31   0.40

Regression
coefficient     0.77                      0.95   1.18   1.14    0.76   0.86   0.81





The results of fitting the model are shown in Tables 7 and 8. The F-distributions 
for variance ratio tests may be regarded as rough approximations since the probabilis- 
tic dummy variables are determined independently of the regression data. 

The significance of all style differences is substantially increased. For mathe- 
matics, the differences still do not reach significance, though the contrast between the 
mixed style and the formal and informal styles is more pronounced. For English, the 
differences by Method I reach significance at the 5 per cent level of $F_{2,33}$, but are not as
large by the other two methods. For reading, the style by pre-test interaction with two
degrees of freedom (df) contains one component with almost all the sum of squares:
the contrast of informal with formal slopes. This single df term is significant at the
5 per cent level of $F_{1,31}$ for Method II, and is almost significant for Method III. It is
not significant by Method I.
TABLE 9
READING TEST ON PRE-TEST REGRESSIONS: LATENT CLASS PROBABILITIES

Method II     Formal     $\hat{Y} = 4.14 + 1.02x$
              Mixed      $\hat{Y} = 25.5 + 0.80x$
              Informal   $\hat{Y} = 54.7 + 0.52x$

Method III    Formal     $\hat{Y} = 10.5 + 0.96x$
              Mixed      $\hat{Y} = 22.9 + 0.82x$
              Informal   $\hat{Y} = 58.9 + 0.48x$


The estimated regressions for the three groups for reading by Methods II and III
are shown in Table 9. The slope is greatest for formal, and least for informal styles,
and the parameter estimates are very similar for the two methods. For classes scoring
low on the pre-test, the informal style has a much higher mean test score: an eight-point
difference for the lowest class. The cross-over between formal and informal
occurs at about x = 102, and classes scoring high on the pre-test do better under a
formal style; for the highest class, the difference is six points. The mixed regression
is in between, but is closer to the formal regression.
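
The cross-over point follows directly from the fitted lines: two regressions $a_1 + b_1 x$ and $a_2 + b_2 x$ meet where $x = (a_2 - a_1)/(b_1 - b_2)$. The sketch below applies this to the formal and informal coefficients as read from Table 9. The Method II formal intercept is hard to read in the table and is taken here as 4.14, which is an assumption; with coefficients rounded as printed, the cross-over comes out near 101, in line with the value of about 102 quoted above.

```python
def crossover(intercept_a, slope_a, intercept_b, slope_b):
    """Pre-test score at which two regression lines predict the same test score."""
    return (intercept_b - intercept_a) / (slope_a - slope_b)


# (intercept, slope) pairs as read from Table 9; 4.14 is an assumed reading.
table9 = {
    "Method II": {"formal": (4.14, 1.02), "informal": (54.7, 0.52)},
    "Method III": {"formal": (10.5, 0.96), "informal": (58.9, 0.48)},
}

for method, lines in table9.items():
    (a_formal, b_formal) = lines["formal"]
    (a_informal, b_informal) = lines["informal"]
    x_star = crossover(a_formal, b_formal, a_informal, b_informal)
    print(method, round(x_star, 1))
```

Below the cross-over the informal line lies above the formal line, and above it the order is reversed, which is the pattern described in the text.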


Comparison with maximum likelihood estimation 

We consider finally the estimation of the parameters of the model by maximum likelihood.
Programmes for ML estimation in the unbalanced mixed model are not widely
available (BMDP has such a programme, but it was not implemented on UMRCC)
so a GENSTAT macro was developed (by Dorothy Anderson). Tests of the hypotheses
of no interaction or no style main effects are based on the likelihood ratio test, and
Table 10 gives an ‘analysis of deviance’ table in which the entries have $\chi^2$ distributions
under the appropriate null hypothesis. This table should be compared to Table 7 
where the approximate ANOVA methods were used. 
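
Each entry in the analysis of deviance has 2 degrees of freedom, and for 2 df the upper-tail probability of the $\chi^2$ distribution has the closed form $P(\chi^2_2 > d) = e^{-d/2}$. The sketch below applies this to the deviances as read from Table 10; the placement of the decimal points in those entries is our reading of the table, and the dictionary layout is purely illustrative.

```python
import math


def chi2_upper_tail_2df(deviance):
    """Upper-tail probability of a chi-squared variable on 2 df: exp(-d / 2)."""
    return math.exp(-deviance / 2.0)


# Deviances as read from Table 10 (2 df each).
table10 = {
    ("styles adjusted for pre-test", "reading"): 0.8,
    ("styles adjusted for pre-test", "mathematics"): 2.8,
    ("styles adjusted for pre-test", "english"): 5.2,
    ("P x S", "reading"): 0.2,
    ("P x S", "mathematics"): 1.9,
    ("P x S", "english"): 4.0,
}

for (source, test), deviance in table10.items():
    p_value = chi2_upper_tail_2df(deviance)
    flag = "*" if p_value < 0.10 else ""
    print(f"{source:30s} {test:12s} {deviance:4.1f}  p = {p_value:.3f} {flag}")
```

On these readings only the English style effect falls below the 10 per cent level.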


None of the pre-test by style interactions is significant at the 10 per cent level. 
Reading, which had the largest interaction in the class mean models, has the smallest 
here. The style main effect for English is significant at the 10 per cent level. No other
effects are significant. Parameter estimates for the main effect models are given in 
Table 11. The pattern of mean differences is similar to that found by the ordinary 
least squares Method I, but the differences are smaller. 


It should be noted that the class mean method of parameter estimation results in 
a serious loss of efficiency. In the case of reading a misleading interaction appears, 
and the conclusions about the relative differences between styles are incorrect. Since 
the class mean method is based on only 36 ‘ observations ’, the possibility of random 
fluctuations among classes producing misleading results is quite high, and this method 
cannot be regarded as a satisfactory substitute for ML estimation when the number of 
classrooms is small. 


TABLE 10
ANALYSIS OF DEVIANCE OF TEST SCORES: LATENT CLASS PROBABILITIES

                                       Deviance
Source                df   (a) Reading   (b) Mathematics   (c) English
Styles, adjusted
  for pre-test         2       0.8            2.8              5.2*
P x S                  2       0.2            1.9              4.0

* P < 0.10
TABLE 11
ADJUSTED MEAN DIFFERENCES FOR TEACHING STYLES BY RESTRICTED MAXIMUM LIKELIHOOD:
LATENT CLASS PROBABILITIES

              Reading   Mathematics   English
Formal          0.15        1.33        1.91
Mixed          -1.29       -2.56       -2.18
Informal        1.14        1.22        0.27


Conclusion 

The teaching style differences in achievement which were found in TS are modified
by the re-analysis. There are two reasons for this. First, the analysis of covariance 
model which includes the random effect of teachers results in greatly reduced signifi- 
cance of any differences, because of the large random variation among teachers. 
Second, the clustering of teachers by the latent class model changes the nature of the 
differences between teaching styles. 


The only significant teaching style differences are in English, where the formal 
style has the highest mean, mixed the lowest, and informal is in the middle. In 
mathematics, the formal and informal styles are close, and substantially above the 
mixed style. In reading informal has the highest mean, mixed the lowest, and formal 
is in the middle. Though the differences may appear small, the four-point difference 
between formal and mixed in reading corresponds to a 6 to 8 months difference in 
reading age. It is of interest that the mixed style which was distinguished in the 
cluster analysis by a relatively high frequency of disciplinary problems, and by the 
lowest use of formal testing, gives consistently the worst results in the achievement 
model. 


RECOMMENDATIONS 


The re-analysis of the TS data discussed in this paper raises important issues for
the design and analysis of future educational research studies of this kind. 


First, research designs using multi-stage sampling of schools and classrooms are 
natural and administratively feasible. The examination of intact classrooms for 
teacher or pupil differences does raise difficult statistical problems, but the formidable 
difficulties of randomised experiments in a school administrative context mean that 
non-randomised observational studies will remain important in educational research. 


When intact classrooms are the effective experimental or quasi-experimental unit, 
but outcomes are measured on pupils in the classroom, the correlation between pupils 
within a classroom must be allowed for by a suitable variance component model. 
Such models are necessary for multi-stage sampling procedures of all kinds in general 
survey designs. It should be clear from the discussion in Section 2 that the effective 
sample size for testing the significance of effects at the teacher level is the number of 
classrooms in the study, and this number should therefore be as large as possible: 
many classrooms with few pupils in each will give much greater power than few 
classrooms with many pupils. Financial constraints obviously impose a limit on the 
possible number of classrooms, but a small number of classrooms is likely to result 
in low power and the failure to find differences. Only the four-point difference in
reading was statistically significant, but smaller differences than this are educationally 
significant. 
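
The point about many small classrooms against few large ones can be made concrete with the usual two-stage variance formula for a mean: with $J$ classrooms of $n$ pupils each, the variance is $\sigma_T^2/J + \sigma_W^2/(Jn)$, where $\sigma_T^2$ is the among-teachers component and $\sigma_W^2$ the within-classroom component. The sketch below assumes a balanced design, and the two variance components are hypothetical values chosen only for illustration, not estimates from the TS data.

```python
def variance_of_style_mean(n_classes, pupils_per_class, var_teachers, var_within):
    """Variance of a style mean under balanced two-stage sampling."""
    return var_teachers / n_classes + var_within / (n_classes * pupils_per_class)


VAR_TEACHERS = 15.0  # hypothetical among-teachers variance component
VAR_WITHIN = 45.0    # hypothetical within-classroom variance component

# Two designs with the same total of 900 pupils.
few_large = variance_of_style_mean(30, 30, VAR_TEACHERS, VAR_WITHIN)
many_small = variance_of_style_mean(90, 10, VAR_TEACHERS, VAR_WITHIN)

print(round(few_large, 3), round(many_small, 3))
```

Adding pupils within a classroom reduces only the second term; the first term, which usually dominates, can be reduced only by adding classrooms.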

The approximate methods of analysis described in Section 2 are not satisfactory 
alternatives to full maximum likelihood estimation in the variance component model. 
In particular, the use of class means results in a very serious loss of efficiency in 
estimating effects at the pupil level (for example, the regression of test on pre-test, or
the size of sex differences). 


A major gap exists in statistical packages in this area: BMDP is the only package 
available which has a maximum likelihood programme, and implementation of this 
programme at UMRCC has been substantially delayed. Efficient methods for the 
analysis of multi-stage sample designs cannot become generally used without good 
general programmes. 


In non-randomised observational studies, many sources of potential bias are 
present. We cannot interpret effects of interest (like teaching style differences) in 
such studies as though they had arisen from properly randomised experimental studies. 
The best that can be done is to measure other possible confounding variables, 
and to allow for their effects through covariance analysis. This ‘statistical control’
is never a substitute for ‘experimental control’ through randomisation. In our
interpretation of teaching style effects, we noted that confounding of pre-test score
with teaching style did not occur, and so tentative conclusions could be drawn about
the ‘effects’ of different styles. However there are many other possible confounding
variables, some of which were discussed by Gray and Satterly (1976) and Bennett and 
Entwistle (1976), and in TS itself, and so the interpretation of different teaching styles 
as a cause of differential achievement should not be pushed too far. An important 
implication of non-randomised studies is the need for measurement of a large number
of possible confounding variables, and the resulting complexity of the statistical 
models which need to be fitted. 


In the discussion of clustering in Section 1, great emphasis was placed on the 
latent class model. Latent variable models are essential in the analysis of studies of 
this kind, in which a large amount of information (38 items here) is available about 
each teacher, but the number of teachers used in the second stage is relatively small. 
It would not be possible to use the 38 items as explanatory variables in a regression of 
test score on pre-test and the items, for there are more items than teachers in the second 
stage. The items are treated as indicators of an underlying latent style of teaching, 
and so the dimensionality of the teacher information is reduced from 38 to two (the 
two dummy variables needed for three styles). 
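
The way binary items are turned into class membership probabilities can be sketched as follows. This assumes the standard latent class formulation, in which items are independent within a class, so that the posterior probability of each class is proportional to its prior proportion times the product of the item probabilities. All numbers below are hypothetical, and four items stand in for the 38 used in the study; the exact parameterisation of Section 1 is not reproduced here.

```python
def posterior_class_probs(items, priors, item_probs):
    """P(class | item responses) for binary items under local independence.

    items      : list of 0/1 responses for one teacher
    priors     : class proportions, one per class
    item_probs : item_probs[k][j] = P(item j = 1 | class k)
    """
    joint = []
    for prior, probs in zip(priors, item_probs):
        likelihood = prior
        for response, p in zip(items, probs):
            likelihood *= p if response == 1 else (1.0 - p)
        joint.append(likelihood)
    total = sum(joint)
    return [value / total for value in joint]


# Hypothetical three-class model with four binary items.
priors = [0.4, 0.3, 0.3]
item_probs = [
    [0.9, 0.8, 0.9, 0.7],
    [0.2, 0.1, 0.3, 0.2],
    [0.5, 0.5, 0.6, 0.4],
]

posterior = posterior_class_probs([1, 1, 1, 0], priors, item_probs)
print([round(p, 3) for p in posterior])
```

The first two of these probabilities are what would replace the (0, 1) dummy variables for a teacher with this response pattern.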


It is worth emphasising again that clustering methods must be based on statistical
models if they are to have any validity. Cluster algorithms based on distance functions 
which bear no relation to the type of data considered, or to any probability consider- 
ations, cannot be expected to produce clusters which have any statistical validity. The 
probabilistic nature of cluster membership is an essential feature of the statistical 
model, and formal assignments to clusters from standard algorithms overstate the 
real information available from clustering. (In the re-analysis, the addition of the 
random error involved in producing a formal assignment actually reduces the differ- 
ences among styles.) It is important to note that effective clustering requires a sample 
size which is large relative to the number of descriptive items used. If the sample size
is not large relative to the number of items, the occurrence of multiple local maxima 
of the likelihood function indicates that there may be several different configurations 
of clusters which are equally well supported by the data. It would have been pointless 
to attempt to cluster a small sample of, say, 50 teachers using 38 binary items, or even 
ten items. Again financial constraints limit the possible sample size in observational 
studies. There are very few programmes available for probabilistic clustering. The 
normal mixture model was described by Wolfe (1970), and a FORTRAN listing for 
this model is given in the book on cluster analysis by Hartigan (Wiley, 1975). Goodman
(1978, p. 468) gives a brief reference to a programme for maximum likelihood
latent structure analysis.


Simple macros in GLIM and GENSTAT can be written for ML estimation of 
the parameters in general mixture models, including the latent class model of Section
1. Programme listings can be obtained from the Centre for Applied Statistics at 
Lancaster. 


The final comments are on the importance of statistical computing and statistical 
modelling in the analysis of educational studies of this kind, and indeed of any studies 
involving multi-stage sampling. The statistical theory of the analysis of unbalanced 
mixed models has been established for at least 10 years, but only recently have any 
computer programmes been developed which are suitable for the analysis of such
studies. Such programmes are still not generally available, and a pressing need 
exists for the development of general-purpose programmes or sub-routines which can 
handle the latent variable models on which clustering methods should be based. Such 
programmes are under development at Lancaster in the Complex Social Data research 
programme supported by the SSRC. 


The importance of statistical modelling is more general. The discussion of 
‘class’ versus ‘pupil’ as the ‘unit of measurement’ can be resolved by answering
the question, ‘What is the appropriate statistical model for data from a multi-stage
sample?’ This is even more important for cluster models, which are much less
developed statistically. 
FORMAL OR INFORMAL? A RE-ASSESSMENT OF
THE BRITISH EVIDENCE 


By J. GRAY 
(Division of Education, Sheffield University) 


AND D. SATTERLY 
(School of Education, Bristol University) 


Summary. The original findings of the study Teaching Styles and Pupil Progress 
(Bennett, 1976) commanded widespread attention and generated considerable debate 
about the relative effectiveness of ‘formal’ and ‘informal’ teaching styles. This article
is prompted by a number of developments. These include: recent re-analysis of the
original data reported elsewhere in this issue (Aitkin et al., 1981); some further
re-analysis of the same data which we ourselves have undertaken; the existence of a number
of other British studies which are directly relevant to these questions; and the continuing
dominance of the ‘formal/informal’ dichotomy in much discussion of research on
teacher effectiveness. Our concern is to determine whether there is an overall trend 
in the British evidence in favour of one or other style and to reach some judgment about 
how important the formal-informal distinction might be in future studies of teacher 
effectiveness. 


THE LANCASTER RE-ANALYSIS 


THE Aitkin et al. (1981) re-analysis of Teaching Styles and Pupil Progress (henceforth
referred to as TSPP for short) makes an important contribution to our understanding 
of a number of methodological issues in the study of teacher effectiveness but leaves 
several others open for further research (cf., Gray and Satterly, 1976; Satterly and 
Gray, 1976; and Gray and Satterly, 1978). Since the re-analysis plays an important 
part in any re-assessment of the overall research evidence, however, certain aspects of 
its interpretation must be emphasised. It may be summarised as follows. First, the 
differences between teachers within styles were far greater than the differences between 
styles. Thus one found ‘effective’ and ‘ineffective’ teachers, no matter which teaching
style they adopted. Second, differences between teaching styles were so small as
to be overwhelmed by differences between other systematic effects. And third, the
direction of the differences between teaching styles did not consistently favour more
‘formal’ approaches over ‘informal’ ones.


We have plotted the findings from the Aitkin et al. re-analysis alongside those of
the original study in Figures 1, 2 and 3. There are, however, a number of obstacles to 
comparing the results of the two analyses directly. The latent class assignment procedures
employed in the re-analysis of the data from the questionnaire census produced
different configurations of teaching styles. The categories ‘formal’ and ‘informal’
corresponded to those in the earlier study, but this is scarcely surprising since the
original questionnaire deliberately incorporated items taken from the Plowden Report
and verified by teachers in 12 schools to differentiate ‘progressive’ from ‘traditional’
approaches to teaching (Bennett, 1976, p. 38). The use of more sophisticated procedures
for classification, therefore, reproduced most strongly the original dichotomy
built into the study. Interpretation of the ‘mixed’ category is more problematic,
however. As Aitkin remarks, this category ‘shared some of the characteristics of the
other two classes’ but the ‘disciplinary problems and low frequency of testing and
assessment gave this class a somewhat different character from that of the mixed
style in TSPP.’ As an appendix to the full report comments, ‘there (was) no single
continuum of teaching style’ (Aitkin and Bennett, 1980, p. 50). Quite clearly the
latent class assignment method used in the Aitkin re-analysis is to be preferred to the 
iterative relocation method used in TSPP but, because a number of teachers changed 
FIGURE 1
GAINS IN READING BY TEACHING STYLE: RESULTS OF 4 METHODS OF ANALYSIS

FIGURE 2
GAINS IN ENGLISH BY TEACHING STYLE: RESULTS OF 4 METHODS OF ANALYSIS

FIGURE 3
GAINS IN MATHEMATICS BY TEACHING STYLE: RESULTS OF 4 METHODS OF ANALYSIS

Key to Figures 1 to 3:
Original TSPP results (Bennett, 1976, Ch. 5).
Maximum likelihood method using probabilistic values for assignment of teachers to styles (Method IV).
Average of Methods I to III applied to probabilistic values.
Re-analysis by Gray and Satterly of original TSPP results based on 36 teachers.


their class memberships, the analysis of the change data for the intensive sample of 
teachers inevitably produced different patterns of gains from those reported in Chapter 
5 of TSPP. 








THE GRAY AND SATTERLY RE-ANALYSIS 


We were interested in comparing the three styles as they had originally been 
constituted, but using statistical methods which took the variation among teachers 
within styles into account in assessing the effects of style. Accordingly, we repeated 
the analysis using 36 of the original 37 teachers with their original style memberships 
(the data for the one ‘ missing’ teacher were unavailable to us). Essentially, this 
analysis adopted a variance components model in which teachers within styles were 
treated as a random effect (Searle, 1971). Although the pattern of gains was more 
similar to those originally reported than those in the Aitkin re-analysis, the effects of 
styles were small and fell short of statistical significance (Reading: F(2,33) = 0.14;
Mathematics: F(2,33) = 0.61; and English: F(2,33) = 0.58). The results of our
re-analysis are also displayed in Figures 1, 2 and 3. 


The variance components model used in our analysis closely approximates 
Method II adopted by Aitkin et al. In passing we note that whilst their four models 
differ in their statistical efficiency, they all produced very similar patterns of results. 
In view of the similarity between the results obtained by these methods, it seems clear 
that researchers interested in relative statements about the superiority of styles could 
equally well employ the simple variance components model which recognises the 
hierarchical nature of the data and uses the usual least-squares estimates for the 
regression lines. More complex approaches are required when the absolute magnitudes
of the teaching style effects are sought, as Aitkin et al. have argued.




THE OVERALL EVIDENCE FROM BRITISH STUDIES 


The findings from the TSPP re-analyses may now be integrated with those from 
several other British studies relating to the debate about ‘formal’ and ‘informal’
methods. For the purposes of this review we confined ourselves to studies based on 
British primary schools. We also placed greater weight in our interpretation of the 
evidence on studies employing a pre/post-test research design where teachers and/or 
teaching styles were the major focus of interest. 


Three studies emerged as being most directly related to these purposes (Bennett, 
1976; Gray, 1979; and Galton and Simon, 1980). The details of Bennett's study will 
already be familiar, but those of the other two studies may not be. It is also important
to bear in mind that none of the studies we considered operationalised identical
definitions of teaching styles; indeed a detailed analysis suggests that they related to 
each other only in the broadest sense. 


Galton and Simon's study was based on 58 teachers of 8- to 10-year-olds in three 
local authorities. Six teaching styles were identified but using the iterative relocation
method now rejected by Aitkin et al. as unsuitable for this purpose. The progress of 
children was monitored in a number of curriculum areas. None of the six teaching 
styles corresponded exactly to the formal-informal characteristics identified in TSPP 
(either in their original or re-analysed form). Indeed, the researchers were at pains to 
stress that their conception of teaching styles differed from this particular dichotomy 
in a number of important respects. They were aware, however, of the interest in 
making such comparisons and therefore assisted their readers in identifying which of 
their styles resembled the formal and informal ones of other studies. They suggest 
that their ‘class enquirers’ displayed most ‘formal’ characteristics and that their
‘individual monitors’ were, in contrast, more ‘informal’ (Galton et al., 1980). 
There are a number of features of their research design and analysis, however, which 
complicate the interpretation of the results and whose influence on the overall pattern 
of results remains essentially unknown (Gray, 1980); for the purposes of the present 
review these will be ignored. 


Gray (1979) based his study on the teaching of reading by teachers of top infant 
children (6+ to 7+) in two outer London boroughs. Classes were studied during the
first year of the project and tested for reading progress. The research design was then 
repeated again for the next cohort of children with teachers surviving from the first 
year of the study. The final analyses were based on 41 teachers and their classes. 
Teachers were observed on several occasions and a description of their classroom 
activities in terms of a number of dimensions was ‘ negotiated’ with them. The 
‘informal’ and ‘quite informal’ categories in this study were based, in contrast to the
other studies, on teachers' own understandings of these terms. 


The study was deliberately designed to incorporate certain features that would 
make the interpretation of differences between classes somewhat less problematic 
than in previous studies. Retrospectively, however, it is clear that it achieved these 
improvements at some expense; in particular, it focused exclusively on reading as this 
was the only area of the curriculum in which top infant teachers were believed to share 
largely common objectives. These and other issues are explored at greater length 
elsewhere (Gray, 1979). 


We present summary evidence from these three studies in Table 1. For each 
study we indicate the number of classes, the age-range of the children, the area(s) of 
basic skills tested and whether the results favoured a more formal approach when 
compared with a more informal one. In this last respect we indicate both the trend of 
the results and whether they were statistically significant. We shall consider, first, 
whether the trend of results favours more formal approaches, subject by subject, and 
then, subsequently, their statistical significance. 
TABLE 1
SUMMARY RESULTS FOR BASIC SKILLS FROM THREE BRITISH STUDIES OF TEACHING STYLES

                                                                       Do results favour a
                                                                      more formal approach?
Study and source               No. of classes   Area of basic skills   Trend   Statistical
                               and age-range                                   significance

Bennett (1976)                 36
  Aitkin re-analysis           (10+ to 11+)     Reading                no      no
                                                English                yes     no
                                                Maths                  no      no
  Gray & Satterly re-analysis                   Reading                yes     no
                                                English                yes     no
                                                Maths                  yes     no
Galton and Simon (1980)        58               Reading                no      no
                               (8+ to 10+)      English                yes     no
                                                Maths                  yes     no
Gray (1979)                    41               Reading 1              no      no
                               (6+ to 7+)       Reading 2              yes     no





Notes: Selection of the studies is justified in the text. Statistical significance has been defined in the 
Galton and Simon and Gray studies using our approximation to Method II (see text) and employ- 
ing the conventional 5 per cent level. 


For reading, two of the results show a trend in favour of more formal approaches: 
Gray and Satterly (here) and Gray (1979, year 2). Both are, however, counter- 
balanced by other findings. There was no trend in Aitkin’s re-analysis in favour of 
more formal approaches (indeed, as Figure 1 shows the pattern was reversed with 
informal somewhat better than formal) nor in Gray's (year 1) or Galton and Simon's
(1980) studies. For English, the trend in favour of more formal approaches in the 
Bennett re-analyses is supported by the finding from the Galton study. Maths also 
presents a similar story. Both the Galton study and the Gray and Satterly re-analysis 
suggest the results favour more formal approaches. Inspection of Figure 3 suggests 
that the Aitkin and Bennett re-analysis also marginally favours a more formal
approach, although the reported size of the difference is very small. On balance, then, 
there would appear to be some evidence (albeit weak) that more formal approaches are 
more effective in raising scores in mathematics as well. 


None of the differences between formal and informal approaches, however, 
attains statistical significance at the 5 per cent level, although some of them begin to 
approach it. The relationship between sample size and the power of a statistical test is 
well known. It could be argued that some of the differences of the size shown in 
Table 1 would become statistically significant if the samples upon which they were 
based had been larger. Of course with very large samples even trivial differences can 
achieve statistical significance. As we have argued previously, studies of teacher
effectiveness really require larger numbers of teachers to be sampled (Satterly and 
Gray, 1976, p. 12). None of the differences in Table 1 is, in fact, as large as three 
points (approximately four to six months difference in progress) on a typical standardised
test; if they had been, then we are confident that they would also have been
statistically significant. In fact most of the differences between formal and informal
approaches averaged around half this size. Even if such differences were statistically
significant we would be reluctant to describe them as educationally significant, given 
both the problems of possibly unexamined but confounding variables and the diffi- 
culties of persuading teachers who had already developed one style to adopt another. 


Against this view, it must be observed that, when the trend of results is examined, 
seven out of 11 of the comparisons in Table 1 favour a more formal approach and
only one of the 11 a more informal one. A number of researchers have commented
that conventional standards for statistical significance place a heavy burden of proof 
on the researcher and that potentially interesting findings are, as a result, in danger of 
being dismissed (Carver, 1978). 


We examined a number of other British studies with a view to determining 
whether they provided confirmatory or contradictory evidence for these conclusions. 
All but one of these studies have also been reviewed by Anthony (1979) so readers may 
find it helpful to have a second opinion on them. 


The most important of the four studies is probably that by Barker Lunn (1970). 
As part of her research on streaming in junior schools she examined the effects of 
two teacher types (traditional and progressive) in non-streamed schools. She pre- 
sented her evidence for each teacher type broken down by social class and ability 
level (see Barker Lunn, 1970, Tables 5:6 a-c). For English the differences were modest 
and inconsistent. For maths the trend of the results favoured the more traditional 
type of teacher but the extent of the differences was, again, relatively small, rarely 
exceeding two standardised points. 


The study by Cane and Smithers (1971) of the teaching of reading around 1960 
in 12 infant schools serving disadvantaged areas claims that ‘more teacher-directed’
(i.e. more formal) schools were more successful in securing reading progress. This
study has, however, been re-analysed and found to have a number of serious weak- 
nesses (Gray, 1975). The claims for statistical significance depended heavily on the 
results from one school. When this school was dropped from the analysis the statis- 
tical significance of the results (based on the individual pupil as the unit of analysis) 
was very substantially reduced. The trend of the results still favoured the ‘teacher-directed’
schools, but closer inspection of the variables contributing to the
construction of the ‘teacher-directed’ categories suggests that there were considerable
problems in labelling them as such. More than half the variables employed to construct
the categories seemed to have little to do with ‘teacher-direction’ at all.
They included such variables as: ‘teacher experience’, ‘reception class experience’,
‘age-range of class’, ‘use of sentence method’ and ‘number of teachers in school’.
We conclude that the study offers no reliable evidence to contradict the view already 
established with respect to reading. 


Anthony (1979) refers to a study by Kemp (1955) as offering support to the pro- 
gressive case. However, since the study was a cross-sectional one and the correlations 
between measures of attainment and progressiveness were never greater than 0.16, we
concur with Kemp's own assessment of the relationship rather than Anthony's.
Kemp remarks: ‘There is no evidence in this investigation that progressiveness is
harmful in its effects on attainment, nor that it is particularly helpful’ (p. 75).


The study by Gardner (1966) offers some very limited evidence that more informal 
approaches may be more suitable for English and reading and more formal ones for 
arithmetic. We share Anthony’s doubts, however, about the extent to which the 
findings from this study may be generalised, since the schools in the sample were 
deliberately chosen to be ‘ good of their type’. 


Finally, a recent report by HM Inspectorate provides some limited details of the 
cross-sectional evidence collected in their survey of primary schools. They report 
that: ‘In classes where a didactic approach was mainly used, better NFER scores
were achieved for reading and mathematics than in those classes using mainly exploratory
approaches’ (DES, 1978, p. 95). Unfortunately, they provide no evidence of
how large the differences were nor whether the classes exposed to the two teaching 
styles were matched in most other respects. This finding is, therefore, of dubious 
utility in the present context. 

In sum, we do not believe that any of this additional evidence conflicts with our
earlier conclusions, which were that ‘formal’ teaching styles were probably unrelated 
to progress in reading and only modestly related to progress in English and maths. 
The apparent superiority of more ‘ formal’ approaches over ‘ informal’ ones in these 
latter areas needs to be tempered by the knowledge that the gains were not statistically
significant in conventional terms and that they were small. We doubt, however, 
whether they occurred by chance: the pattern of results seems more consistent than 
one dictated by chance events. 


These comments return us to the question we have addressed before, both here 
and elsewhere. When is a finding educationally as well as statistically significant? 
Given the somewhat rough-and-ready nature of quasi-experimental research designs 
and the problems of controlling adequately for additional and external factors, we 
incline to the conclusion that teaching style, defined in terms of the ‘ formal/informal’ 
dichotomy is not a central concept in the study of teacher effectiveness. 


FUTURE RESEARCH ON TEACHER EFFECTIVENESS 


If teaching style defined in this way is not the key to understanding teacher 
effectiveness, where should research be going? We have a large number of questions 
to ask but, given the stage which research in this field has reached, most of our 
observations are best viewed as suggestions rather than answers. 


First of all, we feel it is important to go back to the original data. We have 
displayed the residual gain scores from TSPP for each of the 36 teachers in each of the 
three subject areas (see Figure 4). These ‘ gains’ express the size of the difference 
between obtained and predicted (post-test) 1974 scores. They were obtained by fitting 
the regression line to the 36 pairs of observations, one pair per teacher. We 
acknowledge that this is, at best, a crude method for estimating the slope. Nevertheless,
the regression coefficients were 0.91 for maths, 0.84 for reading and 0.88 for
English. As such they do not differ alarmingly from those used by Aitkin anywhere
in his re-analysis, albeit from the latent class assignment memberships as distinct
from the original TSPP method of cluster analysis.
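
The computation of residual gains described above can be illustrated with a minimal sketch in Python. It assumes NumPy and ordinary least squares; the class means below are invented for illustration and are not TSPP data, and the function name `residual_gains` is hypothetical.

```python
import numpy as np


def residual_gains(pre, post):
    """Regress post-test on pre-test class means (one pair per teacher).

    Returns the slope, the intercept and the residuals, where each residual
    is the obtained post-test score minus the score predicted from the
    pre-test score.
    """
    pre = np.asarray(pre, dtype=float)
    post = np.asarray(post, dtype=float)
    # np.polyfit returns coefficients from the highest power down.
    slope, intercept = np.polyfit(pre, post, 1)
    predicted = intercept + slope * pre
    return slope, intercept, post - predicted


# Hypothetical class means for six teachers (illustrative only).
pre = [95.0, 100.0, 104.0, 98.0, 110.0, 102.0]
post = [97.0, 99.0, 108.0, 96.0, 112.0, 101.0]
slope, intercept, gains = residual_gains(pre, post)
print(round(float(slope), 2), np.round(gains, 1))
```

A positive residual marks a class that finished above the level predicted from its starting point, a negative one a class that finished below it. With an intercept in the model the residuals sum to zero by construction.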


Figure 4 prompts several questions. How is it, for example, that the class of in- 
formal teacher no. 6 made more progress in reading than the class of informal teacher 
no. 9 to the extent of 19 standardised points? Or that the class of informal teacher no. 
2 made 15 points more progress in reading than the class of informal teacher no. 
12? Indeed, what are we to make of the 24 points difference between teacher 
no. 20 (who according to the Aitkin re-analysis was probably informal rather than 
mixed) and teacher no. 9? We can obviously pose similar questions with respect to 
teachers from the other styles as well (no. 33 versus no. 36, for example, or no. 19 
versus no. 17). Estimates of the differences between teaching styles pale into insignifi- 
cance against this background. Is it just a question of being prepared for the 11-plus 
(the only other factor systematically pursued in TSPP) or are other factors at work? 


Similarly, Figure 4 poses questions across subjects as well as within them. How 
is it that the class of teacher no. 20 was so successful at reading and, to a lesser extent, 
at maths, but merely average at English? Are the skills involved in teaching English so 
dissimilar from those involved in teaching reading? Or was teacher no. 20 more 
worried about the performance of the children in reading and maths and, therefore, 
as a direct result concentrated rather narrowly and exclusively on them? What 
prompted the quite spectacular gain of teacher no. 7 in mathematics? These are 
important questions about why changes have taken place but ones which it is difficult 
to answer from TSPP or other studies which rely on broad clusters to investigate 
teacher effects. 


Part of the difficulty in answering such questions also stems from the way in 
FIGURE 4


SIZE OF DIFFERENCE BETWEEN OBTAINED AND PREDICTED POST-TEST (1974) SCORES FOR 36 TEACHERS IN
READING, ENGLISH AND MATHEMATICS BY ORIGINAL TEACHING STYLE 


Reading English Mathematics 
+15 7 
+14 
+13 20 
+12 
+11 
+10 
+9 
+8 6 
+7 
+6 2 7 7 26 26 
+5 19 14 21 29 
+4 25 32 33 6 25 27 20 29 34 
+3 8 16 35 16 19 
+2 28 2 32 12 16 21 30 32 
+1 1 10 24 31 34 35 3 5 19 25 28 33 
0 5 14 22 23 26 29 5 10 15 8 14 27 35 
-1 3 11 30 31 34 20 22 1 2 6 10
-2 21 1 8 11 23 24 28 30 11 22
-3 18 27 3 33 9 15 23 24 31
-4 4 12 13 18 36 4 18 36
-5 9 17
-6 4 15 17 17
-7 13 13
-8 36
-9 12
-10


Key: Teachers numbered 1-13 were ‘informal’ in the original study; 14-24 were ‘mixed’;
and 25-36 were ‘formal’. All gain scores have been computed using our approximation to Method II
and rounded to the nearest whole number. The data are taken from Aitkin and Bennett, 1980, p. 70.


which research on teacher effectiveness is typically conducted. By the stage when the 
researchers know which are the most effective teachers (as defined by pupil progress) 
and which are probably the most appropriate questions to ask about them, they are 
beginning to write up their results. Aspects of the research design which go wrong, 
teachers who don't like being observed, classes which undergo a change of teacher
during the year, children who fail to provide one or other of pre- or post-test scores 
and so on, compound the problems. In order to cope, researchers back hunches 
(that, for example, teaching style, defined in a particular way, is all-important) and 
live with them, treating everything else as irrelevant ‘noise’. Some of these factors
are, indeed, very difficult to predict and avoid. We believe, however, that through 
careful development of research designs it is possible to eliminate some potentially 
confounding variables. As far as we are concerned, the less dependent a study is on 
the vagaries of statistical controls the better. In quasi-experimental research, of course, 
such aspirations are more distant but we would maintain that considerable improve- 
ments have been practicable in all the major studies we have reviewed. 

We would also prefer a more open-ended style of research. First, considerable 
efforts would be devoted to establishing who were the ‘ more effective’ teachers and 
whether they were consistently effective from year-to-year. It would be important to 
know whether teachers who were more effective in one subject were equally effective 
in others. These are approaches that have already been tried in America by Brophy 
and Evertson (1976) and, on a more limited scale, by Gray (1979). But it is important 
to bear in mind that the highest correlation yet obtained for the stability of teacher 
effects is only 0.5 (Acland, 1976) and many are much lower, suggesting considerable
variability from year-to-year in teacher effectiveness. 

Having identified more stable and effective teachers, researchers would be in a
better position to undertake research strategies in which competing explanations 
were given equal credence. Systematic and intensive observation would play a part 
but it might also be important to recognise that the skills of those who established the 
overall framework might not be appropriate to certain types of enquiry. There is 
considerable scope for the kind of case-study approach recommended by Stenhouse 
(1980) for which a different kind of research training and background is required. 
Of course such projects would become lengthier but we doubt whether, overall, they 
would prove much more expensive. Indeed, given that researchers are now recommending
much larger samples of teachers than previously (Cronbach, 1976; Satterly
and Gray, 1976; Aitkin et al., 1981), such a strategy may be the only cost-effective
procedure. 


Some estimates of the teacher effect 

The data from TSPP can be employed to provide some preliminary estimates of 
the results such a framework might produce for 10- and 11-year-olds and their teachers. 

We computed the correlations between pre- and post-test scores for each of the 
three subjects. For reading the correlation for the 36 classes was 0.83; for maths,
0.91; and for English, 0.90. These correlations tell us that, for each subject, knowing
how well the classes performed at the beginning of the year provided a very good 
prediction of how well they would perform at the end of it. As we have already seen 
from Figure 4, however, there was still scope for teacher effects to operate. Assuming 
that all the differences were due to the teachers themselves (which is probably an 
over-optimistic assessment) then the most effective quarter of teachers differed from 
the least effective quarter by a minimum of six standardised points (a difference of 
approximately one year of progress) for each of the three subject areas. Gains of this 
size seem to us worth knowing about. 

We also compared the gains for each teacher across subjects. The correlations 
suggest that more effective teachers of maths were also likely to be more effective 
teachers of reading or English but the relationships were by no means uniform. The 
correlations between residual gain scores for the 36 teachers were as follows: English 
and maths 0.63; English and reading 0.52; maths and reading 0.50. In view of these
correlations it may be necessary in future studies to distinguish to some extent between 
different subjects. It cannot be safely assumed that the factors which boost per- 
formance in one subject contribute to boosting performance in others as well. There 
are also other ways of defining teacher effectiveness which it would be useful to explore 
such as the effects of teachers upon particular sub-groups (e.g., poor readers). The 
insights potentially available from TSPP and similar studies are still far from exhausted. 
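
The cross-subject comparison can be sketched in the same way. The example below assumes NumPy and uses invented residual gains for eight teachers, not the TSPP figures; the function name `pairwise_correlations` is hypothetical.

```python
import numpy as np


def pairwise_correlations(gains):
    """Return Pearson correlations between every pair of subjects.

    `gains` maps a subject name to an array of residual gain scores,
    with teachers in the same order for every subject.
    """
    names = sorted(gains)
    # Each row passed to np.corrcoef is treated as one variable (subject).
    matrix = np.corrcoef([gains[name] for name in names])
    return {
        (a, b): float(matrix[i, j])
        for i, a in enumerate(names)
        for j, b in enumerate(names)
        if i < j
    }


# Hypothetical residual gains for eight teachers (illustrative only).
gains_by_subject = {
    "reading": np.array([8, -3, 1, 5, -6, 0, -2, -3], dtype=float),
    "english": np.array([4, -2, 0, 6, -4, 1, -3, -2], dtype=float),
    "maths": np.array([6, -1, -2, 3, -5, 2, 0, -3], dtype=float),
}
correlations = pairwise_correlations(gains_by_subject)
for pair, r in correlations.items():
    print(pair, round(r, 2))
```

Moderate rather than near-perfect coefficients from such a table would point in the direction argued above: effectiveness in one subject is only a partial guide to effectiveness in another.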


DISCUSSION AND CONCLUSION 


There is a tendency to race onwards to new (but only occasionally original) 
explanations of teacher effectiveness. Researchers need to recognise that the search 
for more effective teachers is likely to be a largely confirmatory one in which competing 
accounts are tested and, potentially, reconciled. The quality of classroom discourse 
to which pupils are exposed, the match or mismatch between what pupils are capable 
of and what their teachers provide, the amount of time that pupils are stimulated into 
spending on the tasks at hand, in contrast to the amount of time they spend being 
or feeling bored, and so on, may be competing explanations; but they might, equally 
well, be different sides of the same coin. If this is the case, we cannot afford to stress 
one at the expense of the others or to lose the effects associated with specific variables 
in broader clusters of teaching styles. 


It is unusual in educational research for debates to have conclusions; in conse- 
quence it is difficult to say whether this paper represents one or not. Even the present 
re-analyses and discussion leave a series of important questions about the TSPP
data and other studies unanswered. It would have been useful, for example, to know
whether the same clusters of teaching styles and the same patterns of gains would have
emerged if the 11-plus had been taken into account (cf. Gray and Satterly, 1978).
Claims for ‘informal’ methods are often made in terms of less easily measurable, 
non-cognitive outcomes which remain largely unexplored in British research. 


Whatever the conclusions to be drawn from the research, however, we cannot 
ignore the fact that the formal-informal, traditional-progressive debate continues to 
represent one of the more popular ways of speculating on the effectiveness of teachers, 
regardless of the cautions of researchers to the contrary. If the present re-analyses 
and discussion of teaching styles have merely shown the need to supplement these 
approaches with others, then they will have served their purpose. 


PRODUCTIVITY OF SCHOOLS IN RELATION TO
PROCESS AND STRUCTURE VARIABLES OF
EDUCATIONAL ENVIRONMENT: A STUDY OF
ACHIEVEMENT IN GEOMETRY


By P. SINGH 
(Khalsa College of Education, Amritsar, Punjab, India) 


Summary. A sample of 474 10th graders was drawn from 17 schools, covering a
catchment area of one district in Punjab (India). Using process and structure variables
as inputs and the criterion variable of achievement in geometry as an output, the 
schools were dichotomised as productive and under-productive on the basis of their 
predicted achievement in multivariate regression equations. The global and analytical 
picture of such schools was viewed in terms of process and structure variables associated 
with school, classroom, teachers and students. It was observed that productive schools 
were characterised by bigger size, higher pupil:teacher ratio and lower expenditure per 
student in terms of teachers' salaries. The variables of school characteristics such as:
curriculum press, methods of teaching and student teacher interaction in the classroom, 
curricular activities, school rules, regulations, policies and school traditions revealed a 
trend in favour of productive schools, which was further substantiated in the analytic 
picture of schools giving the highest and the lowest output. The teachers of productive 
schools were conspicuous by superior self-concept irrespective of their qualifications 
and experience. However, when these variables were made operative, experience 
favoured better attitudes toward ‘ professional growth’ and ‘school discipline’. The 
pupils of productive and under-productive schools did not differ in respect of their pre- 
vious academic background; however intelligence and socio-economic status emerged 
as differential variables. 


INTRODUCTION 


THE problem of differential academic achievement has been intriguing the mind of 
research workers since the very inception of mental measurement. Strenuous efforts 
have been made to explain this phenomenon by analysing different aspects of human 
personality, but the educational theorists and scientists are still at variance in the 
identification of all those variables which could explain the phenomenon of differential 
academic achievement in its entirety. In the 1950s and 1960s, a new dimension was 
added, which brought the term ‘environment’ into prominence. A natural scientist
conceives of environment as a complex of climatic, biotic and edaphic forces which
interact with the organism and the ecological community at large. 


For a social scientist, however, environment is not merely ecological but a com- 
plex aggregate of social and cultural conditions. Educational environment is a part 
of this environment and is generated by the conditions, processes and socio-psychological
stimuli which interact with the child and affect his educational accomplishments
(Dave, 1963). It is perhaps due to the conjoint interplay of this environment
with process and structure variables that some institutions consistently show superior 
output in terms of students’ achievement. This is the issue upon which the present 
study has been focused. 


Dave (1963) and Anthony (1967) have tried to explore the influence of home and 
classroom environment process variables on the achievement of the child. Husen 
(1967) has reported a comparative picture of achievement in mathematics in different 
cultures by studying the interaction of variables associated with home, school, class- 
room and teachers. Keeves (1975) studied the relationship between process and 
structure variables of home and school, and achievement in maths. Seeking suste- 
nance from the conceptual endeavours of these studies, the present investigation was 
undertaken with a view to examining the phenomenon of differential output among
schools in terms of students’ achievement in geometry. In passing, it may be men- 
tioned that no effort has been made to isolate process from structure variables as it is 
the conjoint interplay of these variables which gives a unique personality to institutions 
and consequently affects their output. The process variables were assumed to be 
processes and forces which operate in home, school and classroom environments and 
which may influence the academic achievement of the child, whereas structure vari- 
ables were associated with the structure of home, school and classroom. 


METHOD 

Sample 

By resorting to stratified random sampling, a sample of 474 10th graders was 
drawn from 17 schools, covering a catchment area of one complete district in Punjab
(India). In schools where there was only one section of 10th graders all the students
were included in the sample, whereas in schools having more than one section students 
were drawn from all the sections, by resorting to proportionate random sampling. 
The rationale of selecting students from all the sections in a school was to have a 
complete and composite picture of an institution. 


Materials 
(a) Geometric concepts test (GC test) 

To have an objective criterion of output of schools, the possible alternatives were:
(i) to take the achievement of 10th graders based upon their school record, (ii) to accept 
the composite scores of different subjects, obtained by the students in the annual 
school board examination, (iii) to have the achievement of the students on a standard- 
ised test designed by the author. In the absence of an objective and uniform criterion 
of assessment in various schools, the first choice was not acceptable. The second 
possibility was also not free from logical and statistical incompatibilities, as the pooling 
of marks of different subjects obtained on incomparable, so-called interval scales of
measurement without accounting for inter- and intra-examiner variability was a 
questionable measure to assess the output of schools. In view of these considerations 
preference was given to the third alternative. Teaching of plane geometry starts at the 
6th standard in high schools of Punjab and terminates at the 10th grade. Considering 
the sequential nature of the subject, it was felt that achievement in geometry could 
serve as an objective and valid measure with which to gauge the output of schools. 
Mass failure in the subject of maths in the final examination conducted by the state 
school board of education was another factor which had a special appeal for viewing 
the output of schools in terms of achievement in geometry. 

Exhaustive analysis of geometry textbooks prescribed for 10th graders and thorough 
discussion with subject experts led to the identification of 23 primary (basic) and 27 
secondary (applications of primary) concepts. Having conceived a broad framework 
of the test in terms of primary and secondary concepts, search was made for relevant 
test items. Works of Gulliksen (1950), Guilford (1954), Ross and Stanley (1954), 
Bloom (1956), Brumfield et al. (1962), Ebel (1966), Kelly and Ladd (1969), Payne and 
McMorris (1972) and Stanley and Hopkins (1972) proved to be of immense value in 
shaping and sharpening the conceptual framework of the test. The preliminary draft 
of the test comprised 70 multiple-choice test items having four choices in each item. 
The concepts along with the number of items representing each concept are reported 
in Table 1. 

The major consideration in deciding the number of items under each primary 
concept was the weighting given to the concept in the prescribed textbooks and 
frequency of occurrence of the primary concept in relation to secondary concepts. 
Consequently, triangle, circle, angles and locus were represented by 13, 11, 8 and 3 
TABLE 1


CONCEPTWISE ALLOCATION OF ITEMS INCLUDED IN
THE PRELIMINARY DRAFT OF THE TEST


Sr. No.   Name of the concept          No. of items

   1      Line                               1
   2      Plane                              1
   3      Solid                              1
   4      Angle                              8
   5      Simple closed curve                2
   6      Geometric figure                   2
   7      Parallel lines                     2
   8      Triangle                          13
   9      Convex figures                     2
  10      Rhombus                            2
  11      Square                             2
  12      Cube                               2
  13      Triangular prism                   2
  14      Tetra-hedron                       2
  15      Right circular cone                2
  16      Right circular cylinder            2
  17      Circle                            11
  18      Projection of a point              2
  19      Projection of a line               2
  20      Locus                              3
  21      Cyclic quadrilateral               2
  22      Ratio and proportion               2
  23      Polygons                           2

                                         N = 70


items respectively (vide Table 1). All other concepts were represented by two items
each, save axiomatic concepts of line, plane etc., which were represented by one item 
each. The preliminary draft of the test was tried over a sample of 100 pupils. This 
trial helped in checking the language of the items and functioning of the distracters. 
Fifteen items which were totally unattempted were dropped.


For item analysis, a sample of 275 students was involved. The procedure of item 
analysis was basically the same as suggested by Dutton (1964) with slight modifications.
The index of item difficulty was decided by making use of the standard error of
proportions. In a one-tail test the CRs of proportions of correct responses are 2.33
and 1.65 at the 0.01 and 0.05 levels respectively. Thus for items having four choices
(N = 275 and SE = 0.0261), the proportions of correct responses indicating the
real knowledge of the items beyond chance success were 0.312 and 0.294 at the 0.01
and 0.05 levels respectively. If the proportions of items fell short of these values, they
were considered to have been scored by chance.
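
The arithmetic behind these cut-off proportions can be sketched as follows. The sketch assumes that the chance proportion for a four-choice item is 0.25, that the standard error is that of a proportion, sqrt(pq/N), and that the cut-off is the chance proportion plus the critical ratio times the standard error; the function name `chance_threshold` is hypothetical. Under these assumptions the cut-offs come out close to the figures quoted in the text.

```python
import math


def chance_threshold(n_choices, n_students, critical_ratio):
    """Return (cut-off proportion, standard error) for one item.

    The cut-off is the chance proportion of correct answers plus
    `critical_ratio` standard errors of that proportion.
    """
    p = 1.0 / n_choices                       # chance of guessing correctly
    se = math.sqrt(p * (1.0 - p) / n_students)
    return p + critical_ratio * se, se


for level, cr in (("0.01", 2.33), ("0.05", 1.65)):
    threshold, se = chance_threshold(4, 275, cr)
    print(f"level {level}: SE = {se:.4f}, cut-off = {threshold:.3f}")
```

An item answered correctly by a smaller proportion of pupils than the cut-off would, on this reasoning, be treated as having been scored by chance.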


Item discrimination: Discriminating power of the items was decided by using
Johanson's (1951) method of Upper-Lower Index. The significance of ULI was
computed by ‘t’ test.


\[ \mathrm{ULI} = \frac{R_U - R_L}{f} \]

where $f = 0.27N$ and $R_U$ and $R_L$ indicate number of correct responses in the upper
and lower 27 per cent of cases based on total scores.
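
A minimal sketch of an upper-lower index is given below. It takes the index to be the difference between the numbers of correct responses in the upper and lower 27 per cent groups divided by the group size f, which is an assumption about the exact form used here; the pupil scores are invented and the function name `upper_lower_index` is hypothetical.

```python
def upper_lower_index(total_scores, item_correct, fraction=0.27):
    """Return (R_U - R_L) / f for one item.

    total_scores: total test score of each pupil.
    item_correct: 1 if the pupil answered this item correctly, else 0.
    f is the size of the upper and lower groups (fraction * N, rounded).
    """
    n = len(total_scores)
    f = max(1, round(fraction * n))
    # Rank pupils by total score, lowest first.
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower, upper = order[:f], order[-f:]
    r_upper = sum(item_correct[i] for i in upper)
    r_lower = sum(item_correct[i] for i in lower)
    return (r_upper - r_lower) / f


# Hypothetical scores for ten pupils (illustrative only).
totals = [10, 45, 30, 22, 48, 15, 38, 27, 41, 19]
item = [0, 1, 1, 0, 1, 0, 1, 1, 1, 0]
print(upper_lower_index(totals, item))
```

An index near 1 indicates that the item separates high and low scorers well; an index of zero or below marks the kind of item that was modified or dropped.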


The items which discriminated negatively or failed to discriminate between the 
upper and lower 27 per cent of cases were modified or dropped. So, the GC test in the 
final form consisted of 50 items covering 23 primary and 27 secondary concepts.
The time limit for the test was decided when nearly 90 per cent of the students had 
completed the test. It was 60 minutes, including 5 minutes for instructions. 

Test reliability: Re-test reliability of the GC test was worked out for a sample of 
60 students (32 boys, 28 girls). The reliability coefficient after a gap of two weeks 
was found to be 0.84.

Validity of the test: Content, predictive and criterion related validities were 
established. Content validation was done during test construction, whereas predictive 
validity was worked out for a sample of 110 students (75 boys, 35 girls). The first 
administration of the test was completed in December, 1975. These very students of 
the 10th grade were going to appear in the State school board examination in March, 
1976. The achievement scores on the GC test were correlated with the final achievement
scores in maths. The product moment ‘r’ between these scores was found to
be 0.81. The criterion validity was established by correlating the achievement
scores on the GC test with the achievement scores in maths, obtained by these students
in the 9th grade annual examination. For N = 474 the coefficient of correlation
between these two types of scores was found to be 0.345.


(b) School characteristics index (SCI) 

To study the characteristics of Indian schools, the SCI was prepared on the lines 
of the CCI evolved by Stern and Pace (1958). The SCI consisted of 88 items classified 
into five categories named as: (i) Curriculum (SCIC1), (ii) Methods of teaching and
student-teacher interaction in the classroom (SCIC2), (iii) Curricular activities (SCIC3),
(iv) School rules, regulations and policies (SCIC4) and (v) School traditions (SCIC5).
The split half reliabilities of these categories by the Rulon difference method were
found to be 0.76, 0.66, 0.79, 0.79 and 0.77 respectively for categories 1 to 5.


(c) Intelligence test 
Scores on Raven's Standard Progressive Matrices were used as a measure of
intelligence of the pupils.


(d) School information form (SIF) 
SIF was designed to solicit information on the following variables associated with
the schools:

(1) Total enrolment (school size)
(2) Total number of teachers
(3) Total expenditure on teachers’ salaries
(4) Pupil:teacher ratio
(5) Expenditure per student in terms of teachers’ salaries.


(e) Teacher information form (TIF) 

TIF comprised items associated with: teachers’ biodata, professional activities
e.g., organisation of maths clubs, involvement in research and innovative practices, 
participation in summer institutes, refresher courses, seminars and workshops in 
maths, total teaching experience, use of teaching aids in maths instruction, regularity 
in giving and checking home assignments in maths, time devoted per week to maths
instruction to the 10th graders and class size, etc. However, information on the
following items could also be recorded. 


(1) Marital status 

(2) Academic qualifications 
(3) Professional qualifications 
(4) Total teaching experience 
(5) Size of the class 
All other items became redundant, as they failed to become variables.


(f) Student information form (SsIF) 
SsIF was prepared to elicit information about the pupils and their family back- 
ground. It comprised three sections. 


Section One: Biodata 

Section Two: Family background variables. 

Section Three: This section consisted of items related to parents' involvement in 
pupil's education, their aspirations about the child and the child's 
own aspiration and interest in maths. 


Most of the items in this form also became redundant as they failed to become 
variables. An important revelation of this form was that poverty in general was a big
hurdle in family activities and pupils were dependent upon parents for their future 
plans. The information on the following items could be satisfactorily recorded: 


(1) Age of the child. (2) Sex. (3) Family size. (4) Education of the father. 
(5) Education of the mother. (6*) Socio-economic status of the family (SES). 
(7) Availability of help to the child for doing home assignments. (8) Time devoted 
at home to study of maths. (9*) Previous academic achievement 9th grade (TAS). 
(10*) Maths achievement 9th grade (MAS). (Variables marked with an asterisk have 
been considered for viewing the analytic picture of productive schools.) 


(g) Teacher attitude scale 

This scale is a standardised instrument for measuring the attitude of teachers.
The author of this scale claims that in the process of standardisation a combination of
Likert and Thurstone techniques has been used. Initially 200 items were selected,
which were rated by 50 judges in the first trial and 20 judges in the second trial. The 
items were scored on a six-point scale, viz., Strongly Agree (SA), Agree (A), Mildly 
Agree (MA), Mildly Disagree (MD), Disagree (D) and Strongly Disagree (SD). 
Scaling was done on the basis of a scale product technique by weighting each response 
category in usual Likert fashion, then multiplying the same with the scale value of the
statement. For positive items weighting on the six-point scale varied from 6 to 1, 
whereas for negative items the weighting was reversed. In the final form this scale 
consisted of 70 items, divided into seven sub-scales having 10 items on each. The 
seven independent scales were indicative of teachers' attitude toward: 


(1) Teaching Profession (ATP)
(2) Professional Growth (APG)
(3) Teaching Methods (AMT)
(4) Students (ASs)
(5) Curricular Activities (ACA)
(6) School Discipline (ASD)
(7) Self Perception (ASP).
The split-half reliabilities of the seven scales were: 0.58, 0.53, 0.70, 0.57, 0.73, 0.39
and 0.95 respectively.
Validity, i.e., ‘rs’ with the total score on the Teacher Attitude Scale, for the seven
scales were: 0.71, 0.80, 0.73, 0.80, 0.62, 0.79 and 0.66 respectively.
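
The scale product scoring described for this instrument can be illustrated with a minimal sketch. It assumes that a respondent's sub-scale score is the sum, over items, of the Likert weight multiplied by the statement's scale value; the summation, the function names and the two scale values shown are assumptions made for illustration and are not taken from the scale's manual.

```python
# Response categories in order from Strongly Agree to Strongly Disagree.
CATEGORIES = ["SA", "A", "MA", "MD", "D", "SD"]


def likert_weight(response: str, positive: bool) -> int:
    """Weight 6..1 for positive items; the weighting is reversed for negative items."""
    rank = CATEGORIES.index(response)  # 0 for SA ... 5 for SD
    return 6 - rank if positive else rank + 1


def scale_product_score(responses, items) -> float:
    """Sum of (Likert weight x scale value) over the items (assumed aggregation)."""
    return sum(
        likert_weight(response, item["positive"]) * item["scale_value"]
        for response, item in zip(responses, items)
    )


# Hypothetical items: these scale values are invented for illustration only.
items = [
    {"positive": True, "scale_value": 8.5},
    {"positive": False, "scale_value": 3.0},
]

# "A" on a positive item gives weight 5; "SD" on a negative item gives weight 6.
score = scale_product_score(["A", "SD"], items)
```

Under these assumptions a favourable response to a statement with a high scale value contributes most to the total, which is the intended effect of combining the two techniques.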


PROCEDURE 
The four instruments, SsIF, SCI, GC test, and the intelligence test were adminis-
tered to 474 pupils drawn from 17 schools. The Attitude Scale and TIF were given to
those maths teachers whose pupils were involved in the study. Information about the
structure of schools was obtained through SIF. Scoring of the various instruments
involved was done schoolwise and teacherwise. Multivariate regression equations
were set up using continuous and normally distributed variables as predictors. The
criterion variable was the mean achievement score of the school on the GC test. The
predictor variables were: ITS, SES, TAS, MAS, SCIC1, SCIC2, SCIC3, SCIC4 and
SCIC5. The schools revealing predicted achievement lower than the actual mean
achievement were designated as productive, whereas those having predicted achieve-
ment higher than the actual mean achievement were classed as under-productive. The
analytical picture of productive and under-productive schools was studied taking
process and structure variables associated with the school, the classroom, the teachers
and the pupils as inputs. The results of such analyses are reported in Tables 2 to 11.
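
The designation rule stated above can be sketched as follows, using the actual and predicted mean achievement of four schools as reported in Table 4. The function name is hypothetical, and the handling of an exact tie (classed here as under-productive) is an assumption, since the text does not say how a tie would be treated.

```python
def classify_school(actual: float, predicted: float) -> str:
    """Productive if actual mean achievement exceeds the regression prediction."""
    # Assumption: a school whose actual mean equals its predicted mean is
    # treated as under-productive; the source does not cover this case.
    return "productive" if actual > predicted else "under-productive"


# (actual, predicted) mean achievement for four schools, as given in Table 4.
schools = {
    "I": (21.91, 19.62),
    "C": (20.82, 18.91),
    "F": (18.32, 18.67),
    "N": (16.04, 17.01),
}

labels = {code: classify_school(actual, predicted)
          for code, (actual, predicted) in schools.items()}
```

The classification depends only on the sign of the residual (actual minus predicted), not on its size, so a school that narrowly exceeds its prediction is grouped with one that exceeds it by several points.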


RESULTS 
I. IDENTIFICATION OF PRODUCTIVE AND UNDER-PRODUCTIVE SCHOOLS


Entering nine variables, viz., ITS, SES, TAS, MAS, SCIC1, SCIC2, SCIC3,
SCIC4 and SCIC5 as independent predictors and mean achievement score on the
GC test as criterion (output), 17 regression equations (one for each school) were
set up, identifying seven schools as productive and ten schools as under-productive.


TABLE 2
CORRELATION MATRIX (10 × 10) FOR THE CONTINUOUS VARIABLES FOR THE TOTAL SAMPLE (N = 474)

Variables ASG ITS SES TAS MAS SCIC1 SCIC2 SCIC3 SCIC4 SCIC5
          X1 X2 X3 X4 X5 X6 X7 X8 X9 X10

X1 1.00 273 198 313 345 155 131 052 041 050
X2 1.00 266 264 316 192 208 147 051
X3 1.00 321 312 136 1 163 096 055
X4 1.00 804 237 170 099 044 168
X5 1.00 230 154 070 026 167
X6 1.00 369 422 370 466
X7 1.00 429 237 445
X8 1.00 364 482
X9 1.00 511
X10 1.00

Decimal points have been omitted.
ASG stands for achievement scores on the GC Test.
ITS stands for Intelligence Test Scores.


The inter-correlations of predictors and criterion variable are reported in Table 2, 
whereas the actual achievement, predicted achievement, partial regression coefficients, 
mean values of predictor variables and the regression equation are presented in Table 3. 


II. ANALYTIC PICTURE OF PRODUCTIVE AND UNDER-PRODUCTIVE SCHOOLS ON THE
BASIS OF STRUCTURE VARIABLES OF SCHOOLS


The structure variables of schools considered were: total enrolment, total number 
of teachers, pupil:teacher ratio and expenditure per student in terms of teachers' 
salaries. The results of such analysis are reported in Tables 4 and 5. 


The global picture of productive and under-productive schools can be visualised
from the data inserted in Table 4. The mean enrolment in productive schools is
1,475, pupil:teacher ratio 34 and mean expenditure per student in terms of
teachers' salaries stands at 17.22 rupees, whereas the mean values of these variables in
under-productive schools respectively are 1,402, 28 and 21.95 rupees. These results
are suggestive of a trend, characterising productive schools with bigger size, higher
pupil:teacher ratio and lower expenditure, compared to under-productive schools.


TABLE 4
STRUCTURE VARIABLES OF PRODUCTIVE AND UNDER-PRODUCTIVE SCHOOLS

School  Actual       Predicted    Enrolment  No. of    Experience  Pupil:teacher  Expenditure per student in
code    achievement  achievement             teachers  in years    ratio          terms of teachers' salaries (Rs.)

Productive Schools
I*      21.91        19.62        1,045      22        8           47             7.70
O       21.89        18.72        2,356      99        4.5         24             25.00
C       20.82        18.91        908        29        26.00       31             16.60
P       19.64        19.05        4,424      82        13.00       54             9.70
E       18.45        17.14        446        32        20.00       14             22.40
M       17.91        17.83        490        19        12.00       26             18.60
D       17.23        16.04        655        18        5.00        36             10.58
Mean value                        M = 1,475            11.21       M = 34         M = 17.22

Under-productive Schools
F       18.32        18.67        1,503      78        13.50       19             32.00
J       18.11        18.61        1,279      52        14.50       24             24.50
Q       17.00        17.72        2,134      78        16.50       27             23.67
K       17.70        19.34        1,129      58        11.00       19             26.50
A       17.95        17.40        1,412      28        11.00       50             9.47
G       17.48        17.53        1,828      45        7.00        41             12.32
B       17.08        17.73        2,001      81        5.00        25             25.30
H       16.31        17.86        512        18        30.00       28             20.00
L       16.09        18.12        479        23        10.50       21             28.00
N**     16.04        17.01        1,746      62        17.50       27             17.78
Mean value                        M = 1,402            13.65       28             21.95


* School showing the highest output.
** School showing the lowest output.
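
As a check on the tabled means, the mean enrolments quoted in the text can be recomputed from the enrolment column of Table 4. The sketch below is illustrative only; the variable names are hypothetical and the figures are copied from the table.

```python
# Enrolment figures copied from Table 4 (variable names are hypothetical).
productive_enrolment = [1045, 2356, 908, 4424, 446, 490, 655]
under_productive_enrolment = [1503, 1279, 2134, 1129, 1412,
                              1828, 2001, 512, 479, 1746]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


mean_productive = mean(productive_enrolment)              # about 1,475
mean_under_productive = mean(under_productive_enrolment)  # about 1,402
```

Both results agree, after rounding, with the mean enrolments of 1,475 and 1,402 reported for productive and under-productive schools respectively.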


With a view to a further probe into this trend four schools, two with the highest 
and two with the lowest mean achievement, were picked out from the 17 schools. 
The results of this analysis are reported in Table 5. 


From the results reported in Table 5 it is evident that schools having the highest
and the lowest output differ significantly in their mean output scores, which is sub-
stantiated by the significant ‘t’ of magnitude 6.71 (df = 81, P<0.01). The schools
showing the highest output have a mean enrolment of 1,700, a pupil:teacher ratio of 28
and an expenditure per student of 16.35 rupees in terms of teachers' salaries, whereas
these statistics for schools showing the lowest output are 1,112, 26 and 22.89 rupees
respectively. These results are supportive of the trend observed earlier in the global
picture of such schools. The conclusions arrived at in respect of school size and output
are in line with Smith (1960), Gray (1961), Street et al. (1962), Husen (1967) and
Wilson and Karen (1975). However, empirically the findings do not appear to con-
form with Douglas (1931), Hoyt (1959) and Wiseman (1964). The results of school
size and student performance should be interpreted with caution, as it is not merely
the school size in itself which leads to superior output; maybe bigger sized schools are
economically more viable, stable and capable of providing better educational facilities.
Another peculiar trend revealed by the results of the above analysis is that the students
of better paid teachers achieve less well. This seeming paradox, however, is resolved
when the results inserted in Table 4 are studied carefully. The teachers in productive
schools have less experience in terms of years of service, in comparison with the
teachers working in under-productive schools. Consequently the schools employing
more experienced teachers have inflated salary budgets, boosting expenditure per
student.


III. GLOBAL AND ANALYTIC PICTURE OF PRODUCTIVE AND UNDER-PRODUCTIVE
SCHOOLS IN TERMS OF SCHOOL CHARACTERISTICS 

The global picture of productive and under-productive schools was viewed by 
pooling the mean achievement scores of schools on five categories of school character- 
istics. The results of such analysis have been set out in Table 6. 

A careful glance at the results inserted in Table 6 reveals that for SCIC1 t = 3.85 (df = 472,
P<0.01) is significant, which suggests that productive schools are characterised by
higher ‘press’ for curriculum. The ‘t’ ratios for SCIC2, SCIC3, SCIC4 and SCIC5
do not touch the 0.05 level of significance, yet these values reveal a trend in favour of
productive schools, except in the case of SCIC5, where the difference is negligible.
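
The ‘t’ ratio for SCIC1 can be reproduced from the summary statistics in Table 6. The sketch below assumes that the standard error of the difference was obtained from the two standard deviations and sample sizes without pooling the variances; the paper does not state the formula used, but this assumption reproduces the tabled standard error. The function names are hypothetical.

```python
import math


def se_of_difference(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Standard error of the difference between two independent means
    (unpooled form; an assumption, not stated in the source)."""
    return math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)


def t_ratio(mean1: float, mean2: float, se: float) -> float:
    """Difference between the means divided by its standard error."""
    return (mean1 - mean2) / se


# SCIC1 row of Table 6: productive (M1, SD1, N1) and under-productive (M2, SD2, N2).
se = se_of_difference(2.04, 195, 1.81, 279)  # about 0.182
t = t_ratio(11.30, 10.60, se)                # about 3.85
df = 195 + 279 - 2                           # 472
```

The same two functions can be applied to the remaining rows of Table 6; for SCIC5 the difference between the means is only 0.02, which is why its ‘t’ ratio is negligible.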

This trend was further probed in the analytic picture of schools revealing the 
highest and the lowest output. The results of such analysis are reported in Table 7. 

The results set in Table 7 suggest that schools showing the highest and the lowest
output differ significantly in all the five categories of school characteristics, which are
supported by ‘t’ ratios of value 5.03, 2.07, 3.42, 2.43 and 3.24 in order of categories


TABLE 6
POOLED MEANS, SDs AND ‘t’ VALUES FOR PRODUCTIVE AND UNDER-PRODUCTIVE SCHOOLS

Categories of
school
characteristics  M1     SD1   N1   M2     SD2   N2   SE     df   t
SCIC1            11.30  2.04  195  10.60  1.81  279  0.182  472  3.85*
SCIC2            13.66  2.68  195  13.40  2.19  279  0.232  472  1.13
SCIC3            14.08  2.68  195  13.74  2.52  279  0.244  472  1.39
SCIC4            10.81  1.95  195  10.53  2.02  279  0.185  472  1.53
SCIC5            13.63  2.75  195  13.61  2.79  279  0.258  472  0.077


M1, SD1, N1 and M2, SD2, N2 denote the pooled mean, standard deviation and number of
pupils in productive and under-productive schools respectively.
* Value is significant at 0.01 level.


TABLE 7
‘t’ VALUES FOR MEAN DIFFERENCES IN RESPECT OF CATEGORIES OF SCHOOL CHARACTERISTICS
FOR SCHOOLS SHOWING THE HIGHEST AND THE LOWEST OUTPUT

Categories of    Mean values         Standard error   Mean differences
school           MH        ML                                                 ‘t’
characteristics  (N = 11)  (N = 28)  SEH     SEL      MH − ML            df   ratio
SCIC1            13.18     7.28      0.87    0.54     5.40               37   5.03*
SCIC2            13.91     12.00     0.82    0.44     1.91               37   2.07**
SCIC3            15.54     11.64     0.99    0.65     3.90               37   3.42*
SCIC4            11.91     9.71      0.79    0.44     2.20               37   2.43**
SCIC5            15.18     11.39     0.99    0.63     3.79               37   3.24*


* Value is significant at 0.01 level.
** Value is significant at 0.05 level.
MH, Mean value for the school revealing the highest output.
ML, Mean value for school revealing the lowest output.
NH and NL, No. of students.

taken from 1 to 5. These results support the trend observed in the global picture of
productive and under-productive schools, that is, productive schools tend to have a
greater ‘press’ for: Curriculum (SCIC1), Teaching methods and student-teacher inter-
action in the classroom (SCIC2), Curricular activities (SCIC3), School rules, regula-
tions and policies (SCIC4) and School traditions (SCIC5) than the under-productive
schools.


IV. ANALYSIS OF PRODUCTIVE AND UNDER-PRODUCTIVE SCHOOLS ON THE BASIS OF 
TEACHER VARIABLES 
From a sample of 34 teachers whose students were involved in the study, infor- 
mation was sought through TIF and an attitude scale. TIF could solicit information 
about teachers’ qualifications both academic and professional and teaching experi- 
ence, whereas the attitude scale sought information on seven dimensions of teachers' 


TABLE 8
SIGNIFICANCE OF MEAN DIFFERENCES OF TEACHERS’ ATTITUDE IN PRODUCTIVE AND
UNDER-PRODUCTIVE SCHOOLS

Sr.  Attitude    Mp         Mup
No.  categories  (N1 = 11)  (N2 = 23)  Mp − Mup  SDp    SDup   SE     ‘t’ value
1    ATP         366.91     386.86     19.95     49.62  45.55  17.72  1.126
2    APG         378.54     379.26     0.75      24.46  30.79  9.78   0.008
3    ASs         328.73     315.74     12.99     26.97  38.27  11.25  1.155
4    AMT         377.54     379.30     1.76      27.68  31.17  10.58  0.166
5    ASD         331.45     340.13     9.68      53.36  27.85  17.10  0.566
6    ACA         333.10     349.22     16.12     48.83  29.57  15.96  1.010
7    ASP         291.18     277.22     13.96     15.84  23.30  6.81   2.049*


* Value of ‘t’ is significant at 0.05 level.
Mp = Mean value in productive schools.
Mup = Mean value in under-productive schools.
N1, N2: No. of teachers in seven productive and 10 under-productive schools.


attitudes namely: Teaching Profession (ATP), Professional Growth (APG), Pupils
(ASs), Methods of Teaching (AMT), School Discipline (ASD), Curricular Activities
(ACA) and Self Perception (ASP). The results of such analyses are reported in
Tables 8, 9 and 10.


The results presented in Table 8 reveal that mean values on all the seven sub-
scales of the teacher attitude scale in productive and under-productive schools do not
differ significantly, save for Self Perception (ASP). The significant ‘t’ value of 2.04 (df = 32,
P<0.05) suggests that teachers in productive schools tend to have a better self-
perception irrespective of their qualifications and experience. However, when teachers
were dichotomised on the basis of qualifications (graduate and post-graduate),
teachers in productive schools did not reveal any differential attitude on the seven
sub-dimensions of the teachers’ attitude scale (vide Table 9), whereas in under-
productive schools post-graduate teachers tended to have a better attitude toward
Professional Growth (APG). This was substantiated by a significant ‘t’ value of
2.12 (df = 21, P<0.05). Further dichotomisation of teachers on the basis of experi-
ence (up to 10 years and above 10 years, vide Table 10) indicates that in productive
schools teachers with more than 10 years of teaching experience appear to have a better attitude
toward Professional Growth (APG) and School Discipline (ASD), which was sup-
ported by significant ‘t’ values of 2.57 and 2.56 (df = 9, P<0.05) respectively. In
under-productive schools experience favoured better Self Perception of teachers.
The analytical picture of productive and under-productive schools emerging from the
above discussion projects teachers of productive schools as having better Self




TABLE 9
SIGNIFICANCE OF MEAN DIFFERENCES IN THE ATTITUDE OF GRADUATE AND POST-
GRADUATE TEACHERS IN PRODUCTIVE AND UNDER-PRODUCTIVE SCHOOLS

Productive schools
               Qualifications    No. of teachers
Sr. No.        MG      MPG       NG   NPG   df   ‘t’ value
1    ATP       336.60  368.83    5    6     9    0.23
2    APG       369.40  386.20    5    6     9    0.77
3    ASs       327.00  329.67    5    6     9    0.50
4    AMT       394.00  363.83    5    6     9    1.95
5    ASD       330.00  336.83    5    6     9    1.16
6    ACA       344.20  232.83    5    6     9    0.65
7    ASP       282.60  298.33    5    5     9    1.12

Under-productive schools
               Qualifications    No. of teachers
               MG      MPG       NG   NPG   df   ‘t’ value
1    ATP       378.00  409.00    14   9     21   1.86
2    APG       370.30  404.11    14   9     21   2.12*
3    AMT       321.10  317.11    14   9     21   0.34
4    ASs       373.78  388.00    14   9     21   1.00
5    ACA       345.43  330.78    14   9     21   1.21
6    ASD       345.00  356.11    14   9     21   0.86
7    ASP       271.00  286.44    14   9     21   1.76

t = 2.26 for 9 df is significant at 0.05 level. G: graduate.
* t = 2.08 for 21 df is significant at 0.05 level. PG: post-graduate.

TABLE 10
SIGNIFICANCE OF MEAN DIFFERENCES IN THE ATTITUDE OF TEACHERS HAVING UP TO TEN AND
ABOVE TEN YEARS OF TEACHING EXPERIENCE IN PRODUCTIVE AND UNDER-PRODUCTIVE SCHOOLS

Productive schools
               Experience
               Above      Up to      No. of teachers
Sr. No.        ten years  ten years
               M1         M2         N1   N2   df   ‘t’ value
1    ATP       364.30     366.50     7    4    9    0.71
2    APG       395.00     349.75     7    4    9    2.57*
3    ASs       242.00     319.50     7    4    9    1.56
4    AMT       381.71     370.25     7    4    9    0.60
5    ASD       369.85     296.00     7    4    9    2.56*
6    ACA       342.00     317.50     7    4    9    0.91
7    ASP       300.10     275.50     7    4    9    1.40

Under-productive schools
               Experience
               Above      Up to      No. of teachers
               ten years  ten years
               M1         M2         N1   N2   df   ‘t’ value
1    ATP       399.75     382.30     16   7    21   0.92
2    APG       383.00     384.43     16   7    21   0.07
3    AMT       376.81     385.00     16   7    21   0.53
4    ASs       376.80     385.00     16   7    21   0.54
5    ASD       388.50     342.43     16   7    21   0.30
6    ACA       350.00     346.00     16   7    21   0.43
7    ASP       283.01     263.40     16   7    21   2.14*

t = 2.26 for 9 df is significant at 0.05 level.
* t = 2.08 for 21 df is significant at 0.05 level.


Perception, although experience appears to favour better attitudes toward Profes-
sional Growth and School Discipline.


V. ANALYTIC PICTURE OF PRODUCTIVE AND UNDER-PRODUCTIVE SCHOOLS ON THE
BASIS OF STUDENT VARIABLES


The variables of students considered as inputs were: ITS, SES, TAS and MAS.
The mean values and standard errors and ‘t’ ratios for the mean differences of these
variables for seven productive and ten under-productive schools have been entered
in Table 11.


The results reported in Table 11 show significant ‘t’ ratios of 5.69 and 4.54 for
mean values of ITS and SES (df = 472, P<0.01), whereas ‘t’ ratios for mean
differences in TAS and MAS are not significant. The conclusions that flow from these











TABLE 11
SIGNIFICANCE OF MEAN DIFFERENCES OF STUDENT VARIABLES IN PRODUCTIVE AND UNDER-PRODUCTIVE
SCHOOLS

                                                      SE of
                                                      mean
Variables  Mp      Mup     SDp    SDup   Np   Nup     diff.  df   ‘t’
ITS        39.29   34.81   9.96   10.83  195  279     0.964  472  5.69*
SES        15.75   13.94   4.26   4.30   195  279     0.399  472  4.54*
TAS        361.50  363.51  63.13  55.62  195  279     5.61   472  0.351
MAS        52.64   51.32   13.32  12.22  195  279     1.20   472  1.10

* Value is significant at 0.01 level.


results are: the pupils of productive schools have superior intelligence and belong to 
families having better socio-economic status. However, the pupils of such schools 
do not differ in respect of previous academic background in general and maths in 
particular. 


CONCLUSIONS 


The study reported here was concerned with the examination of the phenomenon
of differential output of schools in terms of achievement in geometry, considering
process and structure variables as inputs with unitary values. The results suggest
that the schools revealing the highest output are characterised by
bigger size, higher pupil:teacher ratio and lower expenditure per student in terms of
teachers’ salaries. Such schools are marked by greater curriculum ‘press’, greater
provision for curricular activities, teachers who encourage student participation in
classroom teaching-learning situations, and authorities who make the
students conscious of school rules, regulations, policies and school traditions. The
teachers of productive schools are conspicuous for better self perception irrespective
of qualifications and experience. However, experience favours better attitudes
toward professional growth and school discipline. The pupils of productive schools
appear to possess superior intelligence and belong to families having better socio-
economic status. However, they do not appear to differ in their previous academic
background.


The conclusions drawn on the basis of student variables should not be interpreted 
in isolation. Undoubtedly intelligence and SES are meaningful correlates of
academic achievement. But the phenomenon of differential performance in its entirety
cannot be explained in terms of aforesaid variables alone. Individuals comparable
in intellectual endowments, taught by the same teacher, have been found to exhibit
marked variations in academic achievement. This should neither be attributed to
errors of measurement, nor to research artefacts: perhaps it is due to factors other 
than intelligence. The observations of Oates (1929) and Thorndike (1963) sub- 
stantiate this view. The conjoint interaction of intellective and non-intellective 
variables generates a specific educational environment leading to differential output 
in terms of academic achievement. Therefore, it would be erroneous to conclude that 
productive schools are giving superior output because the pupils of such schools are 
better economically placed and possess superior intelligence, whilst school, classroom 
and teacher variables have nothing to contribute. Maybe such schools are produc- 
tive because all these variables jointly interplay and generate an environment which is 
conducive to learning.


STIMULATED RECALL: A METHOD FOR
RESEARCH ON TEACHING 


By J. CALDERHEAD 
(Dept. of Educational Research, University of Lancaster) 


Summary. A growth in research on teachers’ ‘interactive’ thoughts and decision-
making has led to the use of the research method of stimulated recall. The method has 
been employed in a number of different forms, but generally involves the replay of 
videotape or audiotape of a teacher's lesson in order to stimulate a commentary upon 
the teacher's thought processes at the time. The appropriate use of the method, the 
variety of ways in which it has been employed and their advantages and disadvantages 
are considered together with an examination of the status and validity of recalled thoughts 
and the problems of interpretation. It is concluded that although questions of validity 
cannot be completely resolved the technique presents a systematic approach to the 
collection of data potentially useful in research on teaching. 


INTRODUCTION 


GAINING access to the thoughts and decision-making of others is intrinsic to the 
endeavour of many social scientists. However, the difficulty in obtaining such access 
and the untestable validity of reported thoughts have often resulted in a greater 
research focus upon more easily identified, and more reliable, behavioural measures of 
human activity (see, for example, the debate between Radford, 1974 and Hebb, 1974, 
concerning the value of introspective evidence). 


In research on teaching, the methods which have most frequently been used to 
describe teaching processes in a naturalistic setting are systematic observation (e.g., 
Bennett and McNamara, 1980) and participant observation (e.g., Stubbs and Dela- 
mont, 1976). The former typically involves an observer counting the incidence of 
particular classroom behaviours using a set of predetermined categories such as
‘teacher questions’ or ‘teacher praise’, whereas the latter involves an observer
attempting, through lengthy periods of observation and interview, to empathise with 
the participants and describe their unique perspectives. However, the restricted form 
of description provided by systematic observation (see McIntyre and MacLeod, 1978) 
and the limited insights resulting from participant observation (McNamara, 1980) 
have left many gaps in our knowledge of teaching processes. 

Teaching, as indicated by Hirst (1971) and Wilson (1972), is a goal-directed
activity, a feature which has come to be strongly emphasised in competency models of
teaching (Medley, 1979) and in the growth of research on teachers’ cognitions (Clark
and Yinger, 1979). Consequently one might argue that any adequate description of
teaching processes must view teaching behaviour in the context of teachers’ aims,
goals or intentions.


THE NATURE OF TEACHERS’ GOALS 


Although questionnaires and interviews have been used to assess teachers’ aims, 
goals, and objectives (Taylor, 1970; Ashton et al., 1975), such variables have generally 
been measured independently of classroom interaction; these measures represent 
goals at a high level of abstraction (general statements of purpose) and their relation- 
ship to classroom behaviour has not been elaborated. It may be unrealistic to suppose 
that the goals or aims which a teacher has in mind before entering a classroom to give 
a lesson are the sole, or even the major, determinants of that teacher’s classroom 
behaviour. During the two hundred or so interactions in which the primary school
teacher is engaged with her pupils every hour, the teacher's general goals may or may
not remain constant, but the intentions of the teacher, or the functions which the
teacher's behaviour serves, may vary considerably from moment to moment, depend-
ing upon such features as the children's behaviour and their reactions to the teacher
and task. In such circumstances, the identification of teachers' thoughts and decision-
making (the reasons they have for acting as they do) could provide useful, or even
essential information in the description of teaching processes, and it is in attempting to 
achieve such descriptions that stimulated recall has recently come to be used in natural- 
istic, classroom-based research. 


WHAT IS STIMULATED RECALL? 


The term ‘stimulated recall’ has been used to denote a variety of techniques. 
Typically, it involves the use of audiotapes or videotapes of skilled behaviour, which 
are used to aid a participant's recall of his thought processes at the time of that 
behaviour. Technical considerations in the use of stimulated recall techniques in 
classroom contexts are described in Conners (1978). It is assumed that the cues 
provided by the audiotape or videotape will enable the participant to ‘relive’ the
episode to the extent of being able to provide, in retrospect, an accurate verbalised 
account of his original thought processes, provided that all the relevant ideas which 
inform an episode are accessible. 


However, the stimulated recall method has taken slightly different forms in 
different research contexts. Its first use is often attributed to Bloom (1953) who used 
audiotapes of lectures and discussions to play back to university students for com- 
mentaries of their thoughts, in an attempt to investigate thought processes in these 
two different learning situations. The students’ reported thoughts were later cate- 
gorised according to their content and their relevance to the subject matter being 
studied, and comparisons were made between the two learning situations in terms of 
these features. 


Kagan et al. (1963, 1967) also developed a form of stimulated recall using 
videotape, named Interpersonal Process Recall (IPR), as a means of increasing 
counsellors’ awareness of interpersonal interactions during counselling interviews. 
Elstein et al. (1972) used stimulated recall in research on clinical decision-making, 
attempting to identify the thought processes of clinicians in simulated diagnostic 
situations. A similar introspective technique, sometimes referred to as ‘protocol
analysis’, has been used in contexts where the participants are not involved in inter-
action with others. In this procedure, participants are instructed to provide running
commentaries or to verbalise their thoughts while engaged in skilled behaviour. The 
technique has been used by deGroot (1965) as a means of describing the strategies of 
chess masters, by Burgoyne (1975) in studying students' evaluations of their courses, 
and by Peterson and Clark (1978) to study the thought processes of teachers in their 
pre-lesson planning. 


RESEARCH ON TEACHING 


In the case of classroom-based research, several studies have recently adopted 
the stimulated recall technique to investigate the thought processes and decision- 
making of teachers while teaching (Peterson and Clark, 1978; McKay and Marland, 
1978; Calderhead, 1979; McNair and Joyce, 1979; King, 1980). The findings,
which constitute a new source of knowledge about teaching, are reviewed elsewhere 
(Calderhead, in preparation). However, the rapid increase in the use of the technique, 
its use in a number of different forms, and the emergence of several questions con- 
cerning the influences and biases upon stimulated recall data, add to the importance 
of assessing both the advantages and disadvantages of alternative approaches and the 
status to be attributed to the resulting data.


The factors which may determine the significance or status of stimulated recall
data can be classified into three main categories. Firstly, several factors may influence
the extent to which teachers recall and report their thoughts. Fuller and Manning
(1973), for example, point out that for most teachers viewing a videotape of one of their
lessons is a stressful, anxiety-provoking experience. One might reasonably argue that
a teacher's level of anxiety, or alternatively her/his level of confidence in her/his own
teaching, may influence this recall or the extent to which she/he is prepared to report
it. In addition, the limitations of visual cues in aiding the recall of thoughts were
pointed out by Bloom (1953) who suggested that each individual perceives a unique
set of visual cues which may or may not be recorded by the researchers. Fuller and
Manning elaborate on a similar point in suggesting that teachers viewing videotapes
of their lessons are perceiving the lesson again and from a different perspective and
tend to be distracted, at least initially, by their own physical characteristics. However, 
the problems presented by anxiety, confidence levels and distraction may well be 
surmountable: for example, users of the technique suggest that the establishment of 
rapport between the participating teachers and researchers, and the teachers’ familiar- 
isation with the stimulated recall procedure, considerably reduce these influences 
and result in fuller recall commentaries (e.g., Tuckwell, 1980). 


A second set of factors, limiting what is possible for teachers to be aware of and
talk about, may be less easily controlled. deGroot (1965), in using protocol analysis 
with chess players, and Sharp and Green (1975) and Hargreaves et al. (1975) in inter- 
viewing teachers about their teaching, point out that some areas of a person's know- 
ledge have never been verbalised and may not be communicable in verbal form. This 
tacit knowledge, which has perhaps been developed through experience, trial and 
error, may be involved in a sizeable part of the teacher's everyday cognitive activity, 
and could not be spontaneously verbalised during a stimulated recall commentary. 
In addition, for the experienced teacher, much classroom behaviour may have reached 
a level of automatisation (Argyle, 1969) in that it has become a largely automatic part of 
the teacher's classroom activity: the teacher may have long since forgotten the ration- 
ale for behaving in such a manner and the behaviour may be engaged in unthinkingly. 
Again it seems unlikely that stimulated recall could reveal thoughts which occur at a 
low level of awareness or possibly without any awareness whatever. 


Nisbett and Wilson (1977) go as far as to assert that self-reporting of higher-order 
cognitive processes is impossible and that reports so-collected are not the result of 
direct introspective awareness but the result of recalling a priori causal theories which 
the participant may regard as appropriate explanations for the outcome of his thoughts, 
and which may or may not represent the actual decision-making processes involved. 
Nisbett and Wilson suggest that people are aware of the products or outcomes of 
their thoughts but cannot adequately monitor them. To support their argument,
Nisbett and Wilson quote examples where the monitoring of thought processes is 
clearly difficult if not impossible—for example, they invite the reader to recall their 
mother's maiden name, and then to attempt to explain how they reached their answer! 
However, one could well argue, as does Eiser (1980), that such cognitive tasks as 
remembering facts are qualitatively different from much of the decision-making of 
goal-directed, social behaviour where the participants may well be able to provide 
accounts, albeit possibly biased, partial or inaccurate, of their own thought processes. 
In this context, Schank and Abelson (1977) make a useful distinction between plans 
and scripts; they suggest that the cognitive processing of scripts is largely unconscious 
and occurs when an activity is routinised or overlearned, whereas the processing of 
plans is more conscious and deliberate and therefore more easily self-monitored.
Stimulated recall techniques may assist the researcher to gain access to the cognitive
processing of more global units of teaching (plans) though perhaps not so easily to the
processing involved within such units.


Thirdly, a set of factors which may greatly influence the nature of the data
generated by stimulated recall concerns the ways in which the teachers are prepared 
for their commentary and how they are instructed to comment. Peterson and Clark 
(1978), for example, used stimulated recall as a means of investigating whether 
teachers’ thoughts matched a particular model of teachers’ classroom decision- 
making. Their model describes the decision-making process in terms of three levels. 
At the first level, it is hypothesised that teachers extract cues during the ongoing 
classroom activity and decisions are made concerning whether the cues are tolerable 
to the teacher; if the cues are not tolerable the second level of deciding whether the 
teacher can act alternatively is reached; and if alternative actions are available the 
final decision concerns whether or not to change the teaching behaviour. 


Twelve teachers' lessons were videotaped and four segments, chosen by the 
researchers from the beginning, middle and end of the lesson, were replayed to the 
teachers and four questions were asked during each segment to determine the level 
of decision-making reached by the teacher. Each teacher taught three lessons which 
were videotaped and the findings suggested that during the later lessons the teachers 
more frequently made ‘fuller’ decisions, that is, they more frequently progressed 
through all three levels of decision-making. One could interpret this finding in a 
number of ways: perhaps as a result of increased confidence or increased awareness 
teachers were reporting their thoughts more fully; alternatively as a result of the 
structured interview the teachers may have learned the model, and structured their 
thoughts accordingly. The problem of respondents identifying the aims of the 
researcher and complying with them has long been discussed in the area of interview 
and questionnaire methodology (e.g., Oppenheim, 1966) and would seem equally 
significant in the use of stimulated recall. Verbal reports of thoughts are easily 
influenced. 

A further problem is highlighted in the study by McKay and Marland (1978) who, 
although avoiding the imposition of any research model on the thoughts of teachers, 
provided detailed instructions of the kinds of thoughts teachers were expected to recall. 
A sample of six teachers were instructed, before teaching a videotaped lesson, to 
provide “a detailed account about: (a) thoughts, feelings and moment to moment 
reactions, and (b) conscious choices, alternatives considered before making a choice and 
the reasons for making a choice” (emphasis is theirs). After the lesson and before the 
stimulated recall commentary the teachers viewed the videotapes themselves. Although 
this technique resulted in the collection of much data (a finding at odds with research 
where teachers were provided with less explicit and less directed instructions for their 
recall of thoughts: McIntyre, 1977; Calderhead, 1979) one might argue that the 
explicit instructions beforehand may have influenced the teaching itself, and the pro- 
cedure may also have encouraged teachers to place a greater degree of post-hoc 
rationality upon their behaviour. 


The questions of preparing teachers for stimulated recall interviews and of 
structuring the interview itself clearly have to be weighed against the possibilities of 
imposing, or encouraging teachers to impose, unreal interpretations upon their 
behaviour. This is not to say that the use of models is inappropriate in research of 
this kind. Teachers may find it much easier to talk about thought processes if they 
know on which thoughts or types of thoughts to focus their attention. But at the 
same time, for the recall data to be of use, such a focus must not substantially distort 
or invalidate the evidence. Given the dilemma, an alternative strategy might be to 
derive a model from the teachers’ own commentaries in order to guide future com- 
mentaries. 

For example, in research on the teaching of creative writing in the primary school 
currently being carried out by the author, teachers’ lessons have been tape-recorded 
and played back to them with the instruction to report upon what was ‘going through 
their mind’ at that time. In these commentaries it would seem that teachers perceive 
creative writing lessons in terms of a series of stages: ‘introducing the topic stage’, 
for example, followed by a ‘discussion and stimulating enthusiasm stage’, ‘a 
vocabulary stage’, ‘getting ideas together stage’, or a ‘tips for writing stage’, and 
so on. A model can consequently be derived for each teacher which can then form 
the basis of future stimulated recall interviews where the teacher is instructed to focus 
attention upon thoughts during one particular stage of the lesson. The intentions of 
the teacher, the actions by which these intentions are fulfilled and the cues to which 
the teacher is attending may then in turn be discovered and further explored in future 
stimulated recall commentaries. It is intended that such research may provide a 
fairly detailed cognitive description of teaching in a particular type of lesson, although 
the research is yet at an early stage. 


INTERPRETATION OF THE DATA 


Even when teachers’ reported thoughts have been collected, as fully as possible 
and relatively free of researcher bias, several issues still remain as to the interpretation 
of such data. Can stimulated recall reports be taken to reflect teachers’ real thoughts 
during teaching? Do teachers’ reasons for behaviour constitute explanations to the 
researcher? Clearly the validity of teachers’ recalled thoughts cannot be rigorously 
checked, although studies have examined the extent to which subjects can accurately 
recall overt ‘checkable’ behaviours, and the inference has been made that similar 
accuracy can be achieved in the recall of covert cognitive behaviour (Bloom, 1953; 
Gaier, 1954). However, there are several possible sources of invalidity: Gaier, for 
example, suggests that participants may censor or distort their recall of thoughts in 
order to present themselves more favourably. In some cases, the extent to which 
retrospective reports of thoughts reflect real thoughts at the time may have to be 
taken largely on trust. As Radford (1974) points out, verbal reports are the only 
means by which information about experience can be obtained, and they yield data 
which would otherwise be inaccessible. This does not mean, however, that verbal 
reports of cognitive processes are always uncheckable. Some crude indication of the 
validity of reported thoughts may be obtained from their internal consistency, and the 
degree to which teachers’ accounts appear to match observed classroom practice. 

Stimulated recall provides a means of collecting teachers’ retrospective reports of 
their thought processes. In so doing it provides a source of data for researchers to 
organise and interpret. A set of categories may be developed to analyse the kinds of 
thoughts that teachers report, and such categories will reflect the interests or hypoth- 
eses of the researcher (e.g., McKay and Marland, 1978; King, 1980). However, the 
models which researchers bring to bear upon this data may differ from the interpretive 
frameworks of teachers. The researcher, in considering other sources of data, and 
problems other than those thought about by teachers, may still be interested in teachers' 
explanations of their own behaviour as data, but may ultimately be seeking explanation 
of another kind. For example, an explanation of why many primary school teachers 
reserve the mornings for ‘formal’ work may be found not simply in teachers’ thoughts 
but in the headteachers’ expectations, the school ‘ethos’, environmental constraints 
such as shared books and equipment, teacher training, the teachers’ own experience 
as pupils, etc. What counts as an appropriate explanation to the researcher depends 
upon the model which the researcher adopts or constructs, which may in turn depend 
upon the purposes of the research. 

The relative advantages and disadvantages of stimulated recall techniques com- 
pared to other commentary approaches, such as that adopted by Hargreaves et al. 
(1975) in interviewing teachers regarding events observed in their classroom, have not 
been fully explored, either empirically or theoretically. In what is probably the only 
comparative study, Gaier (1954) compared university students’ success in recalling 
overt class activities in a free recall situation with that in a stimulated recall situation 
where a videotape was periodically stopped and the students asked to describe what 
followed. Recall of events was found to be considerably greater in the stimulated 
recall context. At present, it is common for commentary techniques employed in 
classroom studies to be used in conjunction with other research methods. In particu- 
lar, they have been used in conjunction with classroom observation instruments, 
resulting in an act-action description of classroom processes. Several philosophical 
accounts of research on human behaviour (e.g., Kaplan, 1964; Harré and Secord, 
1972) distinguish acts (behaviour as it appears to the actor) from actions (behaviour as 
it is culturally defined); the former can only be described by self-reports, whereas the 
latter can be described by observation. An even more eclectic approach, assessing 
act and action in a number of ways, could perhaps lead to more comprehensive 
descriptions of classroom processes and at the same time increase our knowledge of 
the relative merits and potential of alternative research methods. 


In conclusion, it seems that the method of stimulated recall provides researchers 
with a procedure, either directed or non-directed, for collecting data concerning 
teachers’ thoughts and decision-making. However, it can by no means provide a 
complete account of teachers’ thoughts, nor is the method likely to be of use entirely 
on its own. As with all self-report techniques, the resulting data can be influenced 
by a number of factors, some of which are not within the researcher's control. Never- 
theless, current work employing this technique will no doubt lead researchers to explore 
more fully the cognitive aspects of classroom behaviour, and perhaps to develop 
new insights into the nature of teaching. 


REPORTING STUDENT ACHIEVEMENT: 
HOW MANY GRADES? 


By M. C. MITCHELMORE 
(University of the West Indies, Kingston, Jamaica) 


Summary. This paper presents a scientific rationale for deciding the maximum number 
of points on a grading scale used for reporting student achievement. It is first proposed 
that a scale be considered acceptable if a hypothetical retest would lead to at least 
90 per cent of the students being regraded within one point of their original grade. The 
proposed criterion is then used to derive a formula for the maximum number of grades 
in terms of the standard error of the test score in the special case where the standard 
error is constant and the grade boundaries are equally spaced, and applications to more 
general cases are considered. The method is illustrated by applying it to three typical 
assessment situations, in which the estimated maximum acceptable number of grades is 
found to vary from 3 to 9. 


INTRODUCTION 


THE number of grades used for reporting student achievement varies greatly. In the 
United States, most schools and universities use a 5-point scale (A, B, C, D, F) 
but many courses operate a simple pass-fail system. In England, many O-level 
examining boards used to use a 9-point scale, but now they use only six points (A, 
B, C, D, E, U). In Jamaica, the grading systems at the University of the West Indies 
range from a 3-point to a 13-point scale and the teachers’ colleges use a 15-point 
scale. 


What are the merits of these various systems? There appears to be no well-known 
rationale for deciding the number of points on a scale, so that institutions are forced 
to depend on the intuitions of their staff members and the authority of those who are 
most experienced in evaluation matters. It is consequently difficult to make any 
adjustments in an institution's traditional grading scale, because no rational argu- 
ments can be put forth to either attack or defend the existing system. For example, 
there would be many advantages in having a uniform grading scale in all the faculties 
of this university; but in the absence of any clear rationale for choosing the best 
number of points on a grading scale it becomes impossible to reach any decision as 
each faculty emotionally defends its own intuitions and traditions and rejects any 
compromise. 


The present paper presents a scientific rationale for deciding the number of points 
to use on a grading scale in any given assessment situation. First, a criterion for an 
acceptable scale is proposed; it is then shown how this criterion can be used to deter- 
mine the maximum number of scale points from the standard error of the assessment 
score. Finally, the rationale is applied to two common methods of assessment 
(multiple-choice and essay tests) and an example of a composite assessment. 


We shall restrict our discussion to cases where the assessment leads to a numerical 
score, and the problem is to construct an acceptable rule for converting scores to 
grades. This includes the case where several items (e.g., the questions on an examina- 
tion) are independently graded and the grades averaged, since the grades have to be 
converted to numbers to find a total score before the ‘average grade’ can be decided. 
Although there has been a considerable amount of research aimed at deciding the 
optimal number of points on a grading scale used to rate individual items (Lissitz and 
Green, 1975), only one brief, obscure paper addressing the wider problem has been 
identified (Ward, 1972). 


A RATIONALE FOR DECIDING THE NUMBER OF GRADES 


A proposed criterion 

Even the best-designed assessment does not give an exact measure of students' 
ability in the area tested. Every educator knows that a different task or a different set 
of questions, a different marker, a different test atmosphere, or different personal 
circumstances, can alter a student's score to a considerable extent. So although 
percentage scores are still regarded as indispensable by many teachers and adminis- 
trators, they can be grossly misleading in their implied accuracy. Consequently, 
there has arisen a universal practice of grouping scores to form grades; for example, in 
some institutions, a grade of B+ covers the range 77 per cent to 82 per cent. The use 
of grades instead of percentages reduces the number of potential scale points from 
101 (0 per cent to 100 per cent) to 15 or less. 

The major reason for reducing the number of scale points is to make it less likely 
that inevitable variations in the testing situation (e.g., a different set of questions) 
would have an effect on the results of the assessment. For example, we would not be 
in the least surprised if a student who scored 79 per cent on an examination scored 
anywhere between 75 per cent and 85 per cent on a different paper covering the same 
topics (or even on the same paper); but we would hope that a student graded B+ in 
one assessment would be graded B+ in any other similar assessment of the same work. 


However, the uncertainty in assessments cannot be entirely eliminated by reducing 
the number of points on the grading scale. Whatever the scale, there will always 
be several students who fall near the borderlines and who are therefore likely to be 
graded differently on a retest. Consequently, we will never be able to ensure that 
students are always going to be graded the same way on a retest; a variation of one 
scale point either way seems quite likely, and acceptable. It is therefore proposed 
that the number of points on a grading scale be considered acceptable if it is almost 
certain that, were an assessment to be repeated, each student would be regraded 
within one scale point of the original grade. 

The next step is to decide what ‘almost certain’ means. We shall take it to 
mean a mathematical probability of at least 90 per cent. This is a somewhat arbitrary 
figure: a confidence level of 95 per cent is frequently used in educational measure- 
ment, but it is felt that a somewhat lower level is admissible in many of the situations 
we have in mind. (Use of the 95 per cent level would lead to a rather smaller number 
of scale points). The 90 per cent probability is to be understood as an average over 
all students and all testing occasions; the probability of any one individual being 
regraded within one scale point obviously depends on the score obtained. 

The proposed criterion is summarised in Figure 1. 

In a given assessment situation, the maximum reasonable number of grades can 
be found by setting the probability exactly equal to 90 per cent. Such grades would 
then give the greatest possible amount of reliable information about students’ achieve- 
ment, and it may be argued that this is the major purpose of a grading system. How- 
ever, there may be situations where such fine distinctions are not desirable; in that 
case, the probability of regrading within one scale point would exceed 90 per cent. 


FIGURE 1 
A PROPOSED CRITERION FOR JUDGING THE ACCEPTABILITY OF A GIVEN GRADING SCALE 

A grading scale is acceptable in a given assessment situation if the 
average probability of a student being regraded within one scale point 
of the original grade on a parallel assessment is at least 90%. 


A useful corollary 

One might also require of an acceptable grading scale that students with different 
grades should be different in their actual levels of achievement. However, the 
uncertainty in measurement makes this impossible; two students of equal ability near 
a borderline will frequently be given different grades. The most we can ask is that 
students differing by two scale points should be clearly different. We now show that 
this last condition is automatically satisfied by any grading system which satisfies the 
criterion in Figure 1. 

A student's ‘actual achievement level’ is a fiction. It is not given by any one 
assessment, but by the average (arithmetic mean) of the student's scores on all possible 
parallel assessments which could be constructed to measure achievement in the given 
course. If the criterion is satisfied, the probability that the scores obtained by the 
same student on two different assessments would differ by at most one scale point is 
at least 90 per cent. Clearly, the same probability would apply if a different student of 
an equal achievement level was given the second assessment, or if the two students 
were given the same assessment. There is thus a probability of at least 90 per cent that 
students of the same achievement level will be graded within one scale point of each 
other on any assessment. This makes it unlikely that students whose grades differ 
by two or more scale points are of equal achievement levels; statistically speaking, we 
would reject the null hypothesis of equal achievement at the 10 per cent level of 
significance. 

APPLYING THE PROPOSED CRITERION 
Standard error 

An important theoretical concept in the present context is the standard error (SE) 
of an assessment score; the SE is the standard deviation of the set of scores that a 
student would obtain on all imaginable parallel assessments. In what follows, we 
shall make the classical assumption that the frequency of various scores shows a 
normal distribution; the mean is equal to the theoretical achievement level (also called 
the student's true score) and the standard deviation is equal to the SE (see Figure 2). 


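FIGURE 2 
THEORETICAL DISTRIBUTION OF A STUDENT'S SCORES ON ALL IMAGINABLE PARALLEL ASSESSMENTS 

[Figure: normal curve of frequency (vertical axis) against actual test score x (horizontal axis), centred on the theoretical achievement level (true score), with the grade boundaries marked along the score axis.] 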

There is obviously a relation between the SE of an assessment score and the 
number of points on an acceptable grading scale: the smaller the SE, the more 
accurate is the assessment, the finer are the distinctions that can be made, and the 
greater is the number of points on an acceptable grading scale. We shall now investi- 
gate this relation in more detail; since the SE of any given assessment can be calcu- 
lated theoretically or estimated empirically, the results will enable us to determine 
quite objectively the maximum acceptable number of scale points which satisfies the 
criterion in Figure 1. 

In Figure 2, the points $\ldots, x_{-2}, x_{-1}, x_0, x_1, x_2, \ldots$ represent the boundaries between 
one grade and the next. A student's second score is within one grade of his first score 
if the first score lies between $x_i$ and $x_{i+1}$ and the second score lies between $x_{i-1}$ and 
$x_{i+2}$, for any value of $i$. The probability of this happening is 

$$p = \sum_i P(x_i < x < x_{i+1}) \cdot P(x_{i-1} < x < x_{i+2}), \qquad (1)$$

which can be easily calculated using tables of the normal distribution from the true 
score, the SE and the $\{x_i\}$. 
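
As a minimal sketch of this calculation (not part of the original paper), the following Python evaluates the sum in Formula (1) for one true score, assuming normally distributed scores as in Figure 2. The function names and the example boundaries are illustrative assumptions, and scores beyond the outermost boundaries are treated as falling in open-ended top and bottom grades.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def interval_prob(lo, hi, true_score, se):
    """P(lo < x < hi) for a score x distributed Normal(true_score, se)."""
    return normal_cdf((hi - true_score) / se) - normal_cdf((lo - true_score) / se)

def prob_within_one_grade(true_score, se, boundaries):
    """Sum over grades of P(first score in the grade) times
    P(second score in that grade or an adjacent one)."""
    # Open-ended bottom and top grades are modelled with infinite edges.
    edges = [-math.inf] + sorted(boundaries) + [math.inf]
    total = 0.0
    for i in range(len(edges) - 1):
        first = interval_prob(edges[i], edges[i + 1], true_score, se)
        lo = edges[max(i - 1, 0)]
        hi = edges[min(i + 2, len(edges) - 1)]
        total += first * interval_prob(lo, hi, true_score, se)
    return total

# Illustrative values only: boundaries every 10 marks, SE of 6 marks.
example_boundaries = [40, 50, 60, 70, 80]
p = prob_within_one_grade(true_score=62, se=6, boundaries=example_boundaries)
print(f"p = {p:.3f}")
```

The same function can be applied to any proposed set of boundaries, which is what the trial-and-error approach described later requires.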

Further investigation is complicated by the fact that the student's true score is 
unknown, the SE varies from one true score to another, and there may be many $\{x_i\}$ 
which are acceptable. We shall first examine a theoretical special case and then 
consider how the result can be applied in practice. 


A theoretical special case 

As a special case, let us assume that (1) the SE is constant throughout the range 
of scores, and (2) the grade width is constant, i.e., the $x_i$ are equally spaced. Then 
using Formula (1), p can be calculated for any given ratio of the grade width to the SE 
and any given position of the true score in relation to the grade boundaries. It is 
found (by trial and error, using a computer) that when the ratio of grade width to 
SE is 1.635, the probability of the second score falling within one grade of the first 
score varies from 90.2 per cent (when the true score falls on a grade boundary) to 
89.8 per cent (when it falls exactly in the middle of a grade interval), but that the 
average value, assuming that true scores are uniformly distributed throughout the 
interval, is 90.0 per cent. In other words, the proposed criterion will be satisfied if 
the grade width is 1.635 times the SE. 

The average probability is only moderately sensitive to variations in the ratio of 
grade width to SE. A ratio of 1.47 (10 per cent less than 1.635) gives an average 
probability of 86.5 per cent, whereas 1.80 (10 per cent greater) gives 92.7 per cent. 
On the other hand, the range of probability associated with each ratio (varying accord- 
ing to the position of the true score) is quite small; it increases as the ratio increases, 
but the range is still only 0.7 per cent when the ratio is 1.80. The average probability 
will therefore be quite insensitive to variations in the distribution of true scores, so the 
previous assumption that true scores are uniformly distributed is not really necessary. 
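
The figures for the special case can be checked numerically. The sketch below is an illustration rather than the author's program: it sets the SE to 1, places boundaries at whole multiples of the ratio, and evaluates the within-one-grade probability with the true score on a boundary, in the middle of a grade, and averaged uniformly over a grade. The function names, the number of grades summed (`span`) and the number of averaging steps are assumptions.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_equal_width(ratio, offset, span=12):
    """Within-one-grade probability with SE = 1 and boundaries at k * ratio.

    ratio  -- grade width divided by the SE
    offset -- position of the true score within its grade, from 0
              (on a boundary) to 1 (on the next boundary)
    span   -- number of grades summed on each side; the tails beyond
              this contribute a negligible amount
    """
    true_score = offset * ratio

    def prob(lo_k, hi_k):
        return (normal_cdf(hi_k * ratio - true_score)
                - normal_cdf(lo_k * ratio - true_score))

    return sum(prob(i, i + 1) * prob(i - 1, i + 2) for i in range(-span, span))

def average_p(ratio, steps=200):
    """Average over true scores spread uniformly across one grade."""
    return sum(p_equal_width(ratio, (j + 0.5) / steps) for j in range(steps)) / steps

for ratio in (1.47, 1.635, 1.80):
    print(ratio,
          round(p_equal_width(ratio, 0.0), 4),   # true score on a boundary
          round(p_equal_width(ratio, 0.5), 4),   # true score mid-grade
          round(average_p(ratio), 4))            # uniform average
```

Trying other ratios in the loop shows how quickly the average probability moves away from 90 per cent as the grades are made narrower or wider.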

We summarise the basic result in Figure 3. 


FIGURE 3 
A PROCEDURE FOR FINDING AN ACCEPTABLE GRADING SCALE IN A SPECIAL CASE 

In the special case where the standard error is constant throughout the 
range of scores, an acceptable grading scale can be obtained by using 
grade intervals of a constant width no less than 1.635 times the 
standard error. 


Practical applications 

The SE of a test score is inevitably lower for extremely high or low scores due to 
‘ceiling’ and ‘floor’ effects, but it seems to vary little in the middle range of scores. 
For example, using the formula given by Lord (1955), the SE of scores on a multiple- 
choice test of any number of items scored 0 or 1 is found to vary as shown in Figure 4; 
the SE is within 10 per cent of the maximum for true scores between 28 and 72 per cent, 
and within 20 per cent of the maximum for true scores between 20 and 80 per cent. 
It seems reasonable to assume that, for most tests, variations in SE would be negligible 
for true scores in the range 30 to 70 per cent and minor in the range 20 to 80 per cent. 
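
The percentages quoted above are reproduced by a binomial error model, in which the SE at a true score of proportion t is proportional to the square root of t(1 - t) and is greatest at t = 0.5. The sketch below assumes that this is the shape of the curve in Figure 4; the function name is illustrative.

```python
import math

def relative_se(true_score_pct):
    """SE as a fraction of its maximum, assuming SE is proportional to
    sqrt(t * (1 - t)), where t is the true score as a proportion."""
    t = true_score_pct / 100.0
    return 2.0 * math.sqrt(t * (1.0 - t))

for pct in (20, 28, 50, 72, 80):
    print(pct, round(relative_se(pct), 3))
```

Under this assumption the SE is about 90 per cent of its maximum at true scores of 28 and 72 per cent, and 80 per cent of its maximum at 20 and 80 per cent, in line with the ranges given in the text.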


FIGURE 4 
STANDARD ERROR OF SCORES ON A MULTIPLE-CHOICE TEST 

[Figure: standard error, as a percentage of its maximum (vertical axis, 0 to 100), plotted against true score in per cent (horizontal axis, 0 to 100).] 

Once the reliability, r, of a test is known, the SE is normally calculated using the 
formula 

$$SE = s\sqrt{1 - r}, \qquad (2)$$


where s is the standard deviation of the actual scores of the subjects who took the test. 
This value is in fact the root-mean-square average of the SEs of the individual testees 
(Lord, 1955), and does not allow us to find the SE corresponding to each true score. 
However, if the actual scores are mostly in the middle range, the SEs in this range 
will not be greatly different from the average value. 

In any given practical application, where the set of actual scores and an estimate of 
the average SE are known, one of the following three approximate methods can there- 
fore be used to find an acceptable grading scale with the maximum number of points. 


Method 1. If most actual scores are in the middle range, assume that the SE is 
constant and equal to the given average value. Then divide the score range into a set 
of grade intervals of constant width 1.635 SE. Because the SE near 50 per cent is 
actually greater than the assumed value, the probability of students with scores near 
50 per cent being regraded within one grade point will be slightly less than 90 per cent; 
similarly, the probability for students with high or low scores will be slightly more 
than 90 per cent; but the average probability should still be near 90 per cent. 


Method 2. Alternatively, assume that the SE follows the same pattern as in 
Figure 4, and find the maximum SE so that the average of the SEs of the given set of 
testees is equal to the given average SE. (One will have to use actual scores as esti- 
mates of true scores.) Then calculate the SE for various true scores and construct a 
set of grade intervals such that the width of each interval is equal to 1.635 times the 
SE at the centre of that interval. The intervals will get smaller as scores get further 
from 50 per cent, but the probability of being regraded within one grade point will be 
about 90 per cent for all scores. 
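
The interval-construction step of Method 2 can be sketched as follows. This is an illustration under stated assumptions, not the author's procedure: the SE curve is taken to be binomial in shape with its maximum at 50 per cent, the maximum SE is supplied directly (the step of fitting it to the given average SE is not shown), and the intervals are built outwards from a boundary placed at 50 per cent. All names and the example values are hypothetical.

```python
import math

RATIO = 1.635  # grade width divided by SE, from the special case

def se_at(score_pct, se_max):
    """Assumed SE curve: maximum at 50 per cent, binomial in shape."""
    t = min(max(score_pct / 100.0, 0.0), 1.0)
    return se_max * 2.0 * math.sqrt(t * (1.0 - t))

def far_edge(near_edge, direction, se_max, iterations=50):
    """Find the other edge of a grade interval so that its width equals
    RATIO times the SE at its centre (simple fixed-point iteration)."""
    width = RATIO * se_at(near_edge, se_max)
    for _ in range(iterations):
        centre = near_edge + direction * width / 2.0
        width = RATIO * se_at(centre, se_max)
    return near_edge + direction * width

def boundaries_between(low, high, se_max, start=50.0):
    """Grade boundaries built from `start` outwards, kept within [low, high]."""
    edges = [start]
    while (nxt := far_edge(edges[-1], +1, se_max)) <= high:
        edges.append(nxt)
    while (nxt := far_edge(edges[0], -1, se_max)) >= low:
        edges.insert(0, nxt)
    return edges

# Illustrative values only: maximum SE of 7 percentage points.
edges = boundaries_between(20.0, 80.0, se_max=7.0)
print([round(e, 1) for e in edges])
```

The intervals produced in this way narrow as they move away from 50 per cent, as the description of Method 2 indicates.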

It may be argued that the use of the factor 1.635 is invalid since its calculation 
involved the assumption that the grade widths were equal. However, when the grade 
width is about 1.635 SE, only the middle three intervals ($x_{-1}$ to $x_2$ in Figure 2) contrib- 
ute to the calculation of p; the probabilities of scores in the tails are negligible in 
comparison. One would therefore expect to get a probability very close to 90 per cent 
whenever the widths of the given interval and the two adjacent intervals are approxi- 
mately 1.635 SE, even if the more distant intervals are considerably larger or smaller. 
As long as we keep to the range of scores within which the SE does not vary rapidly, 
the widths of any three adjacent intervals will be approximately equal. 

Method 3. If neither equal-width nor equal-probability intervals are acceptable, 
or if there are many extreme scores, then one can only proceed by trial and error. One 
must first assume a constant value for the SE or estimate it for various true scores 
using Figure 4. For any suggested grading scale, the probability of being regraded 
within one grade point can then be calculated using Formula (1). If this probability 
is too low, the scale can be changed by varying the boundaries or (more likely) 
reducing the number of scale points until the probability exceeds 90 per cent. 


SOME EXAMPLES 


We now consider three typical methods of assessment in order to get a feel for the 
number of points which an acceptable grading scale is likely to have. We shall apply 
our method to obtain a grading scale of the type most frequently used in this country 
(Jamaica), namely grade intervals of equal width between ‘pass’ and ‘distinction’ 
levels, without any differentiation below pass or above distinction. For the sake of 
definiteness, we shall take the pass level as 40 per cent and distinction as 85 per cent. 
We shall also make the apparently reasonable assumption that the SE is approximately 
constant between 40 per cent and 85 per cent, so that we can use Method 1 above and 
calculate the grade width simply by multiplying the SE by 1.635. 

Although many institutions use different pass or distinction levels, or prefer non- 
uniform grade intervals, it is unlikely that the number of points on an acceptable 
grading scale would be much different from those found below. Whether any given 
scale is in fact acceptable can easily be checked using one of the methods outlined 
above. 


Multiple-choice tests 

Multiple-choice tests are the easiest to consider first because they have been 
subject to much theoretical investigation, the results of which are generally accepted. 
The SE of a properly-constructed multiple-choice test depends only on the number of 
items, not on the actual items. Lord (1959) found and McMorris (1972) confirmed 
that the average SE of a score on a test of moderate difficulty is approximately 
$0.432\sqrt{n}$, where n is the number of test items. The minimum grade width between the 
pass and distinction levels is thus $1.635 \times 0.432\sqrt{n}$, which is very close to $\sqrt{n/2}$. 

Consider a 50-item multiple-choice test, which is long enough to give a valid 
evaluation of many courses. The minimum grade width is $\sqrt{25}$, that is 5. Table 1 
shows one possible score-grade conversion; some unevenness in the grade intervals is 
almost inevitable—we have chosen to reduce the length of grade intervals 2 and 6 
because the SE is likely to be lowest there. The result is a 7-point scale. 

The reader may verify that for 20- and 100-item multiple-choice tests, the maxi- 
mum acceptable numbers of points on a grading scale of the type described are 5 
and 9 respectively. 
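
The arithmetic for the three test lengths can be sketched as follows, using the constants quoted above (an average SE of 0.432 times the square root of n, and a factor of 1.635) together with the pass and distinction levels of 40 and 85 per cent. The function name is an illustrative assumption.

```python
import math

def min_grade_width(n_items):
    """Minimum grade width in raw marks: 1.635 times the average SE
    of 0.432 * sqrt(n) quoted for a test of moderate difficulty."""
    return 1.635 * 0.432 * math.sqrt(n_items)

for n in (20, 50, 100):
    width = min_grade_width(n)
    pass_mark, distinction = 0.40 * n, 0.85 * n
    # Number of minimum-width intervals that fit between pass and distinction.
    intervals = (distinction - pass_mark) / width
    print(n, round(width, 2), round(math.sqrt(n / 2), 2), round(intervals, 2))
```

Roughly 2.8, 4.5 and 6.4 minimum-width intervals fit between the pass and distinction levels for 20, 50 and 100 items. The scale sizes of 5, 7 and 9 would then correspond to 3, 5 and 7 intermediate grades plus the fail and distinction grades, which means rounding up and accepting slightly narrower intervals, as is done in Table 1.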

TABLE 1 

AN ACCEPTABLE SCORE-TO-GRADE CONVERSION 
FOR A 50-ITEM MULTIPLE-CHOICE TEST 

Score range            Grade (scale point) 

43-50 (distinction)    1 
39-42                  2 
34-38                  3 
29-33                  4 
24-28                  5 
20-23                  6 
0-19 (fail)            7 


Essay tests 

When we turn from multiple-choice to essay tests we meet two complications: 
(1) the score obtained varies from marker to marker as well as from test to retest, 
and (2) there is no theoretical way to calculate the SE. We shall now attempt to use 
available empirical results to estimate the SE for some typical essay tests. 


There has been a considerable amount of research on the grading of essays, 
summarised by Coffman (1971) as follows: 


The accumulated evidence leads . . . to three inescapable conclusions: (a) different 
raters tend to assign different grades to the same paper; (b) a single rater tends 
to assign different grades to the same paper; and (c) the differences tend to 
increase as the essay question permits greater freedom of response (p. 277). 


Coffman further reports that the correlations between markers, as reported in the 
research literature, vary from 0.98 to 0.35. If we assume that a test marked out of 100 
has a standard deviation of 15, Formula (2) above enables us to estimate that SEs due 
to marking typically range from 2 to 12. Essay tests in well-defined areas such as 
mathematics would probably have SEs at the lower end of this range; the higher SEs 
would be associated with more creative writing, as in English composition. 
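
The range of 2 to 12 can be reproduced with a short calculation. The sketch below takes Formula (2) to be the usual expression for the standard error, SE = s times the square root of (1 - r), and treats the correlation between markers as the reliability r; the function name is illustrative.

```python
import math

def standard_error(sd, reliability):
    """Formula (2): SE = s * sqrt(1 - r)."""
    return sd * math.sqrt(1.0 - reliability)

# Standard deviation of 15 marks and the two extreme correlations reported.
for r in (0.98, 0.35):
    print(r, round(standard_error(15, r), 1))
```

With a standard deviation of 15, correlations of 0.98 and 0.35 give SEs of about 2.1 and 12.1 marks respectively.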


The second major source of variability in assessments obtained by essay testing 
lies in the choice of questions for the test paper. Because of the length of time 
required to compose the responses, an essay test necessarily has a small number of 
questions (rarely more than ten, although they may be selected from a large number of 
alternatives); the actual questions answered therefore form a small sample—some- 
times a very small sample—of all the possible questions which could be asked. An 
inevitable consequence is that a re-assessment, using different questions, could produce 
entirely different results. Students whose favourite areas all came up on the first 
paper might have problems finding enough ‘easy’ questions on the second paper, and
general difficulty levels would almost certainly be different. Of course, these factors 
also affect assessment by a multiple-choice test, but to a much smaller extent because 
of the larger number of items which can be answered in the same time. 


Research on the effect of changing the questions on an essay test is much rarer, 
perhaps because the investigator must arrange for at least two papers to be set, sat and 
marked in a situation where dealing with one paper is quite enough work for staff and 
students alike. Three studies have been located, all concerned with first-year work at 
British universities examined by three-hour essay tests scored out of 100. Their 
results indicate SEs of 10 for a physics examination (Black, 1963), 9 for an electronics
examination (McVey, 1972), and from 7 to 14 with a median of 10 for mathematics 
examinations (Hill, 1976). All these figures include the effect of variation among 
markers; only McVey separated the two effects, finding that the elimination of marking 
variation would only have slightly reduced the SE. 

From the empirical results cited, it would appear that an SE of about 10 due to
variation in questions could be expected in an essay test marked out of 100; the SE
due to marking variation could range from 2 to 12. Taking both effects together, we
might expect the SE to vary from about 10 in subjects with well-defined criteria of
correctness to about 16 in areas where evaluation is highly subjective. (The SE for the
two scores combined, assuming that they are independent, is the square root of the
sum of the squares of the two SEs.) For a test with SE 10, the minimum grade width
should be about 16; this value leads to a 5-point scale. For a test with SE 16, the
grade width is 25 and nothing finer than a 4-point scale is acceptable.
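
The combination of the two sources of error can be checked with a minimal sketch of the rule stated in parentheses above (root of the sum of squares, assuming independence). The function name `combined_se` is ours, not taken from the paper.

```python
import math


def combined_se(se_questions: float, se_marking: float) -> float:
    """SE of two independent error sources: square root of the sum of squares."""
    return math.hypot(se_questions, se_marking)


SE_QUESTIONS = 10.0  # SE due to the choice of questions, from the studies cited

# Marking SEs at the two ends of the range 2 to 12.
se_objective = combined_se(SE_QUESTIONS, 2.0)
se_subjective = combined_se(SE_QUESTIONS, 12.0)

print(f"well-defined criteria: SE = {se_objective:.1f}")
print(f"highly subjective:     SE = {se_subjective:.1f}")
```

The two results round to 10 and 16, the values used in the text.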

A shorter essay test scored out of 50 is likely to have an SE of rather more than 
half the SE of a test scored out of 100, as with multiple-choice tests; the SE would 
probably lie between 7 and 11. Such essays could be acceptably graded using 3- or 
4-point scales. 

Table 2 summarises the results obtained so far. 


TABLE 2


ESTIMATED SEs AND MAXIMUM ACCEPTABLE* NUMBERS OF GRADES
FOR SELECTED MULTIPLE-CHOICE AND ESSAY TESTS


                                            Estimated    Maximum acceptable
Type                   Length                  SE         number of grades

Multiple-choice        20 items                1.9               5
                       50 items                3.4               7
                       100 items               4.3               9
Essay with
fairly objective       1½ hr/50 marks          7                 4
scoring                3 hr/100 marks         10                 5
Essay with
highly subjective      1½ hr/50 marks         11                 3
scoring                3 hr/100 marks         16                 4


* According to the criterion stated in Figure 1, assuming a pass mark of 40 per
cent, a distinction mark of 85 per cent, and approximately equal grade
intervals between these two scores.


A composite case

In general, the score from a multiple-choice test or an essay test is not the sole
measure of student performance; a weighted total of scores on several tests, or on 
theory and practical examinations, is frequently used. The SE of the total score can 
be found empirically by carrying out two parallel assessments, or it may be estimated 
from the SE of the various components as illustrated below. 

Consider the case where a course is evaluated using five 20-item multiple-choice 
quizzes and a three-hour final essay examination which is given twice the weight of the 
quizzes. If the quizzes cover separate areas of the course, they are equivalent to a 
single 100-item test; the SE of the total quiz score would thus be about 4.3. The
essay examination might be marked out of 100, and have an estimated SE of 11
(taking an average value); the score would then be doubled to give the desired
weighting and the SE of this component would be 22. Assuming a moderate correlation
of 0.6 between the quiz and the essay scores, the SE of the overall score, out
of 300, would then be about 25. (The SE of the sum of two correlated test scores is
$\sqrt{s_1^2 + s_2^2 + 2 r s_1 s_2}$, where $s_1$ and $s_2$ are the SEs of the two scores and $r$ is the
correlation between them.) The minimum grade width is therefore about 41, leading to a scale
with at most 6 points.
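
The arithmetic of this composite case can be followed in a short sketch. It uses only the figures given in the text (a quiz SE of about 4.3, an essay SE of 11 doubled to 22, and a correlation of 0.6); the function name `se_of_sum` is ours, and the per-quiz SE of 1.9 is the 20-item value from Table 2.

```python
import math


def se_of_sum(s1: float, s2: float, r: float) -> float:
    """SE of the sum of two correlated scores: sqrt(s1^2 + s2^2 + 2*r*s1*s2)."""
    return math.sqrt(s1 ** 2 + s2 ** 2 + 2.0 * r * s1 * s2)


# Five independent 20-item quizzes, each with SE 1.9 (Table 2), behave like
# one 100-item test: the SEs add in quadrature.
se_quizzes = math.sqrt(5) * 1.9

# The essay score is doubled to give it twice the weight, so its SE doubles too.
se_essay_weighted = 2 * 11.0

# Overall score out of 300, using the rounded quiz SE of 4.3 quoted in the text.
se_total = se_of_sum(4.3, se_essay_weighted, r=0.6)

print(f"quiz total SE:     {se_quizzes:.2f}")
print(f"weighted essay SE: {se_essay_weighted:.0f}")
print(f"overall SE:        {se_total:.1f}")
```

The overall SE rounds to 25, as stated. With a minus sign in place of the plus the result would be close to 20, so the quoted value confirms the sign of the correlation term.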

It may be noted that the inclusion of an essay-test component, possibly with a
view to obtaining a more valid assessment of achievement on the course, reduces the
maximum number of scale points below the number which would be acceptable if
only the multiple-choice tests were used. This is a result of the greater accuracy of
multiple-choice tests in comparison to essay tests.


CONCLUSION 


A rationale for finding the maximum number of scale points, based on a clear 
criterion of an acceptable grading system, has been presented. Its application in a 
given evaluation situation requires an estimate of the standard error of the scores on 
the assessment, which may be found directly through a special empirical study or 
estimated by the method illustrated above using results from previous empirical studies. 


In the supposedly typical cases examined, the maximum acceptable number of 
scale points likely to be used in practice varied from 3 for a single 1½ hour essay test
with highly subjective marking to 9 for a single 100-item multiple-choice test; a
combination of five short multiple-choice tests and a final essay examination led to a 
6-point scale. It is therefore not possible to make any general statement as to the 
number of grades which can be used in reporting student achievement; the appropriate 
number must be decided in relation to the specific method of assessment used for each 
course. The number of scale points actually used will also depend on the scores agreed 
as indicating pass or distinction performance, and the purpose of the assessment report. 


Once the appropriate number of scale points for a given assessment situation has 
been decided, it is still necessary to decide what to call the various grades. Apart from 
considerations of tradition, there seems to be no reason to favour letters or numbers. 
The interpretation of each grade depends on the use to which the results are to be put. 
The lowest grade will generally indicate outright failure and the highest distinction, 
but the matter of whether to regard the intermediate grades as varieties of pass or to 
give them finer interpretations (e.g., from ‘good’ to ‘poor’) can only be decided in
context. In cases where it is unreasonable to use fixed percentage pass and distinction 
marks, it will generally be beneficial to state the interpretation of each grade as clearly 
as possible. 


The rationale for deciding the number of points has been presented in relation to 
the overall assessment for a course. It could also be applied to the individual com- 
ponents of the assessment, but it would be dangerous to do so. In the first place, the 
maximum acceptable number of scale points would differ from one component to the 
next. Secondly, there would be a temptation to calculate the overall grade by taking a 
weighted average of the separate grades; this would introduce rounding errors and 
inevitably lead to several students near the borderlines being given the wrong grade. 
Individual items (questions or tasks) should be graded using the largest possible 
number of interpretable scale points, either numerical or literal, and a total score for 
each component calculated. Students may be told what range of scores roughly 
corresponds to each scale point in the final grading system, but no conversion to 
grades should be done. The scores on each component, suitably weighted, should then 
be added to obtain a total assessment score and only then should scores be converted 
to grades. 


The problem of finding a uniform grading system in an institution where many
different types of courses are taught remains a formidable one, since assessment in
science subjects seems to lead to a greater acceptable number of scale points than in
arts subjects. Judging from Table 2, a 5-point scale might be a reasonable
compromise; this scale would probably be justified (according to our criterion) for
assessments made on the basis of about 6 hours of essay testing in a highly subjective
area, about 3 hours of essay testing in a more objective area, or a 20-item
multiple-choice test. Wherever multiple-choice tests are used as measures of
achievement on a substantial proportion of the course objectives, grading on a 5-point
scale will be highly reliable and an essay-test component can be included without fear of
reducing the accuracy of assessment below the agreed level.

The estimates given in this paper are, in the case of essay tests, based largely on 
studies of university examinations in scientific subjects. There is a need for further 
empirical research into the standard error of essay tests in other subjects and at other 
levels. It is possible that the use of several shorter, more structured essays might lead 
to more accurate assessments than our examples suggest. 


INDIVIDUAL DIFFERENCES ATTRIBUTED TO
SELF-CORRECTION IN READING 


By G. B. THOMPSON
(Department of Education, Victoria University of Wellington, New Zealand) 


SUMMARY. An analysis was made of data reported by Clay (1969) as evidence for individual
differences in self-correction behaviour in children's reading. It was found that the index of 
incidence of self-correction behaviour used by Clay had several properties which could not be 
justified and no other measure from the data could support the original conclusions. 


In a widely quoted study by Clay (1969), data were reported on individual differences in 
the incidence of self-correction of oral reading errors of 5-year-old children during classroom 
reading. A measure of the incidence of self-correction was obtained from oral responses to 
basic reading books during the first year of schooling. The measure was found to be posi- 
tively correlated with reading attainment progress levels as tested at the end of that year. 
The interpretation given was that children who by this measure exhibited a higher level of 
self-correction were making greater use of spontaneous ‘checking processes’, and that such
processing is relevant to the efficient acquisition of reading skill. The purpose of this paper 
is to examine the data as originally reported, and critically consider the grounds for the 
measure employed and hence the interpretation given. 

Clay (1969, p. 48) classified the reading responses to each word into one of the following 
(mutually exclusive and exhaustive) categories: 


(a) correct responses, no error having occurred
(b) uncorrected errors
(c) self-corrected errors


Expressed as proportions, p(x), of the total responses to all the words which were read,
$p(a) + p(b) + p(c) = 1$, $0 \le p(x) \le 1$.

Clay presented her data (p. 50, Table 2) in terms of ‘rates’, which were the reciprocals of
p(x). However, p(x) will be used here as it simplifies the exposition and in no way affects the
arguments which follow.

To examine individual differences Clay divided her sample of 100 children into quartile 
groups, called progress groups, according to the reading attainment test levels obtained at the 
end of the school year. Individual differences in incidence of self-correction behaviour were 
described in the form of comparisons between the median cases of these progress groups. 
Clay used the following index to measure the incidence of self-correction behaviour: 


$$\frac{p(c)}{p(c) + p(b)}$$

where p(c) is the proportion of responses which were errors self-corrected by the child, and
p(b) the proportion of responses which were errors uncorrected by the child. The obtained
values of this self-correction index for the median case of each progress group are given in 
Table 1. Clay did not present data on the simple proportion of self-corrected errors, p(c), 
but this can be calculated, as values for p(c)+p(b) are available from Clay's reported data. 
The proportion of uncorrected errors, p(b), was also calculated from the data. These values 
are given in Table 1. 

It will be apparent that the simple proportion, p(c), of self-corrected errors varies little 
between the progress groups and the proportion is certainly no greater in the High progress 
than the Low progress group. In contrast, consider the index of incidence of self-correction 
employed by Clay (Table 1). Clearly the values of this index decrease from the High to the 
Low progress groups, but this is only to be expected as the function

$$\frac{k}{k + p(b)}$$

will decrease as p(b) increases, where k is any constant. Thus the obtained variation in this
self-correction index is merely a reflection of the wide variation in the proportion, p(b), of
uncorrected errors between the progress groups, not in the incidence of self-corrected errors.


TABLE 1
MEASURES FOR THE MEDIAN CASE OF EACH READING PROGRESS GROUP


                        Self-correction       Proportion of
                        index                 self-corrected     Proportion of
Reading progress        p(c)                  errors             uncorrected errors
groups (Quartiles)      ---------             p(c)               p(b)
                        p(c) + p(b)

High                    0.364                 0.010              0.017
High Middle             0.263                 0.017              0.049
Low Middle              0.120                 0.015              0.112
Low                     0.051                 0.020              0.368
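
The point can be checked directly from the median-case values in Table 1. The sketch below recomputes the index from the tabulated p(c) and p(b); the function name `self_correction_index` is ours. Because the tabulated proportions are themselves rounded, the recomputed index can be expected to agree with the reported one only approximately.

```python
# Median-case proportions from Table 1, ordered High, High Middle, Low Middle, Low.
groups = ["High", "High Middle", "Low Middle", "Low"]
p_c = [0.010, 0.017, 0.015, 0.020]             # self-corrected errors
p_b = [0.017, 0.049, 0.112, 0.368]             # uncorrected errors
reported_index = [0.364, 0.263, 0.120, 0.051]  # index as reported


def self_correction_index(pc: float, pb: float) -> float:
    """Clay's index: self-corrected errors as a proportion of all errors."""
    return pc / (pc + pb)


recomputed = [self_correction_index(c, b) for c, b in zip(p_c, p_b)]

for name, c, b, idx in zip(groups, p_c, p_b, recomputed):
    print(f"{name:12s} p(c) = {c:.3f}  p(b) = {b:.3f}  index = {idx:.3f}")

# p(c) moves within a narrow band while p(b) grows roughly twentyfold,
# so the fall in the index is driven by the denominator.
spread_p_c = max(p_c) - min(p_c)
ratio_p_b = max(p_b) / min(p_b)
print(f"spread of p(c): {spread_p_c:.3f}; ratio of largest to smallest p(b): {ratio_p_b:.1f}")
```

The index falls steadily from the High to the Low group although p(c) is, if anything, slightly larger in the Low group.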


Clay does not give a rationale for the index of self-correction. The only apparently
plausible reason which can be imagined is that the index was intended to provide an
adjustment which takes account of the variation between individuals in the opportunity for making
a self-correction response. Presumably the argument would be that when more errors occur
there is more opportunity for self-correction. As p(c)+p(b) represents the proportion of
errors, corrected as well as uncorrected, by this argument it would be appropriate as the
denominator of the self-correction index. But this argument fails to consider that just as
uncorrected errors represent opportunities for overt self-correction which were not taken up,
so also do initially correct responses. They represent opportunities not taken up, either
because no self-checking was required (a correct response was readily available) or because
a self-check did occur, although not overtly, and was thus not observable as self-correction
behaviour.


Moreover, any rationale for the index must provide some interpretation of the
arithmetical properties of the index. One such property is that if the proportion, p(b), of
uncorrected responses approaches zero, then the index approaches the maximum value of unity,
and this will be so whatever the proportion, p(c), of self-corrected responses. In such cases
the index will remain near the maximum value even though the simple proportion, p(c), of
self-corrections varies through the range from unity to a value approaching zero. Such a
property is inexplicable in an index of the incidence of self-correction.
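
This limiting property is easily illustrated. In the sketch below the values of p(c) and the very small p(b) are hypothetical, chosen only to show the behaviour of the index; they are not data from Clay's study.

```python
def self_correction_index(pc: float, pb: float) -> float:
    """Index of self-correction: p(c) / (p(c) + p(b))."""
    return pc / (pc + pb)


# Hypothetical case: almost no uncorrected errors.
p_b = 1e-6

# Hypothetical proportions of self-corrected responses, from high to very low.
p_c_values = [0.9, 0.5, 0.1, 0.001]

indices = [self_correction_index(pc, p_b) for pc in p_c_values]

for pc, idx in zip(p_c_values, indices):
    print(f"p(c) = {pc:<6} index = {idx:.4f}")
```

Although p(c) varies by a factor of 900 across these cases, every value of the index lies within one part in a thousand of unity, so the index cannot distinguish frequent from rare self-correction once p(b) is small.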


The conclusion is that the index of the incidence of self-correction behaviour used by 
Clay has not been justified and has given an indication of a higher incidence of self-correction 
behaviour for the High progress than the Low progress group, where no such indication is 
warranted. The data do not provide evidence for the individual differences which Clay 
attributed to self-correction in children's reading; nor do the data provide support for the use 
of self-correction in the diagnosis and remediation of reading difficulties (Clay, 1979, pp. 14, 
59; Glynn et al., 1979, p. 12). Moreover, the problems discussed here also apply to the use
of the same index of self-correction behaviour in recent research, such as Donald (1979). 


It should be pointed out that what has been said here does not deny the possibility that a 
process of self-checking or self-editing is important to the reader in the acquisition of reading. 
Such processes may not be overt and thus not directly observable. As the data examined
give only direct measures of overt ‘reading behaviour’, they provide no grounds for rejecting
such a possibility; nor have they provided support for the possibility.


THE ‘BOTTLES TEST’: A QUICK, CONVENIENT, ALTERNATIVE
PROCEDURE FOR ASSESSING CONSERVATION OF LIQUID QUANTITY 


By HAZEL BENNER AND K. WHELDALL
(Department of Educational Psychology, University of Birmingham) 


SUMMARY. A sample of 40 six- to seven-year-old children were tested for conservation of liquid
quantity using traditional Piagetian procedures involving pouring to achieve the transformations.
These results were then compared with those subsequently obtained using an alternative
procedure in which sealed bottles were inverted to achieve similar transformations. Very high
correlations between the results of the two procedures were obtained for both conservation
of equality and of inequality. The ‘Bottles Test’ is therefore suggested as a quick, convenient
alternative to conventional liquid conservation assessment procedures.


INTRODUCTION 


It would be pointless to speculate on the number of psychologists, educational re- 
searchers, teachers and others who have, over the years since Piaget pioneered his conser- 
vation assessment strategies, spent countless hours pouring liquids from one container to 
another in assessing generations of children. Whether this time and effort has been well spent
is debatable as is the import of the massive theoretical edifice Piaget and his collaborators 
have constructed on the basis of what many would argue are, at the very least, methodologi- 
cally suspect procedures. Indeed, the tightening up of Piagetian paradigms has become a 
major growth area within developmental psychology. Most of those who have ever engaged 
in testing children using Piaget’s procedure for the assessment of conservation of continuous 
quantity (liquid) would, however, regardless of theoretical persuasion, probably agree that 
the procedure is very time-consuming and rapidly becomes very tedious. Liquids have to be 
carefully measured, poured, adjusted to suit the individual child and then poured again to 
achieve transformations. 

Our own interest in the Piagetian conservation assessment paradigm arose as a result of 
our questioning his verbal assessment procedures. We argued, and subsequently demon- 
strated (Wheldall and Poborca, 1980), that Piaget’s procedures confounded logical skill with 
skill in receptive language and that a non-verbal assessment procedure yielded higher 
conservation pass rates than the traditional verbal interrogation procedures. In the context
of an independent replication and extension of these findings, in which the first author of this 
paper replaced Dr Poborca as the tester (reported in Wheldall and Benner, 1981), we 
explored, as a subsidiary follow-up, alternative verbal assessment procedures. What we were 
attempting to devise was a quicker, more efficient means of verbally assessing conservation 
but which fully incorporated Piaget’s criteria for demonstrating conservation. We were 
particularly anxious to eliminate pouring as the means of achieving transformations since 
spillage (or surreptitious addition/subtraction) is at least possible and children thus tested 
may potentially focus on this feature. It is also time consuming and inefficient.
Consequently we devised two alternative strategies, the ‘bags test’ and the ‘bottles test’, the
latter of which we prefer as the more practical and simpler procedure.

In the ‘bags test’, two plastic bags of the same size are filled with equal quantities of
liquid (thus displaying equal liquid levels) and are suspended from laboratory clamp stands. 
Pressure from a board at the back of one bag will cause the liquid level to rise, whilst lowering 
a bag to a surface will allow the liquid to spread out, thus lowering the liquid level. Achieving 
conservation of inequality tasks by this method was, however, problematic and our attempt 
proved methodologically unsatisfactory; consequently we are not reporting these results 
here (see Benner, 1980, however, for full details). In comparison, the bottles test described in 
this study was simple and effective. The use of sealed bottles containing measured quantities
of liquid, which may be inverted to achieve transformations, constitutes a quick, convenient
alternative procedure for assessing conservation of liquid quantity.


METHOD 
Sample 


The subjects consisted of 40 children randomly selected from the population of six- to 
seven-year-olds attending a rural primary school. The mean chronological age of the group 
was 6:5 (SD 3.35 months) and the mean vocabulary age, as assessed by the English Picture
Vocabulary Test (Brimer and Dunn, 1963), was 7:2 (SD i 19 months). Equal numbers of
boys and girls were included.


Design 
The children were tested on a traditional Piagetian liquid conservation task in the
context of an experiment comparing verbal and non-verbal approaches to assessment as
previously mentioned (Wheldall and Benner, 1981). Each child was tested twice on the standard
‘traditional’ verbal conservation task, once before and once after testing with the Wheldall
and Poborca (1980) non-verbal assessment procedure. Alternative testing using bottles was
carried out immediately following the second verbal conservation test, and the results obtained 
were compared with those for the previous testings. 


Materials 

The materials employed in the traditional verbal test of conservation were similar to 
those employed by Wheldall and Poborca (1980), by which the child’s performance on three 
different tasks was used to assess separately conservation of equality and inequality. For 
equality, one of two equal quantities of coloured water in identical cylindrical containers was
‘transformed’ by pouring into:


(a) a taller, narrower container
(b) a shallower, wider container
and (c) a set of small containers.


Assessment of inequality began with unequal quantities in identical cylindrical containers 
one of which was then similarly transformed with the result that, despite the unequal volumes, 
the liquid levels now appeared equal. 

The alternative testing procedure employed four matched pairs of sealed bottles. The 
‘Bottles Test’ consisted of four tasks, two testing conservation of equality and two of 
inequality. These simulated transformations (a) and (b) described above. It was not possible 
to simulate transformation (c). For this reason and to ensure direct comparability of results, 
the two items involving transformation (c) in the traditional verbal test were omitted from 
the analysis for this study. 


Procedure 

Each child was seen on two occasions. In the first session (s)he was tested on the EPVT
and given the first verbal test of conservation of liquid quantity. In the second session,
following non-verbal training and testing (described elsewhere in Wheldall and Benner, 1981), the
child's conservation ability was again tested verbally. This was followed immediately by the
alternative testing procedure, the ‘Bottles Test’.

In the verbal assessment procedure the traditional tasks were set as described earlier 
and detailed in Wheldall and Poborca (1980), the following question being posed both before 
and after transformation: ‘Do these two glasses have the same amount of water, or does
this one have more water in it, or does this one have more water in it?’ (Henceforth to be
known as ‘the experimental question’.)

The tasks involving sealed bottles were included to assess whether pouring liquid from 
one end of the bottle to the other (i.e., in an enclosed space without the possibility of any loss) 
was a more easily understood transformation than freely pouring from one container to 
another. In two of the bottle tasks (1 and 3) the pair of bottles contained equal quantities
of water. The liquid in one of the pair was then transformed by upturning the bottle. Bottle
task 1 presents the same transformation as occurs when liquid is poured from one container
into a taller, narrower one; i.e., a rise in liquid level (transformation (a)). Bottle task 3, since the
bottles are initially presented upside down, demonstrates the same transformation as occurs
when liquid is poured from one container into a wider, shallower container; i.e., a fall in
liquid level (transformation (b)).

In the other two bottle tasks (2 and 4) the pair of bottles contained different quantities of
water which had been carefully selected to ensure that although the liquid levels differed when
initially presented, when the transformation was effected by upturning the chosen one of the
pair, the liquid level in the two bottles would then be equal. In one of these tasks of
conservation of inequality using bottles (bottle task 4), the transformation caused the liquid level to
rise (transformation (a)), whereas in the other task (bottle task 2), where the bottles were
initially presented upside down, the transformation caused the liquid level to fall
(transformation (b)). As it did not seem possible to test conservation following division of the liquid
using bottles, there were only four tasks, two of which assessed conservation of equality and
two of which assessed conservation of inequality.

Following the second ‘traditional’ (verbal) test of conservation, the experimenter said
‘Now let's look at some bottles.’ Then followed the four tasks involving sealed bottles
containing coloured water. In each set the water was coloured vividly and differently in an
attempt to maintain interest. Throughout the four tasks, a modified version of the
‘experimental question’ (‘experimental question-bottles’) was used: ‘Do these two bottles have
the same amount of water, or does this one have more, or does this one have more?’
Without exception, all subjects said that both bottles contained ‘the same’ in the equality
tasks and that the bottles containing the larger amount contained ‘more’ in the inequality
tasks. Following this question, one of the pair of bottles was then turned upside down and
placed beside the other. The ‘experimental question-bottles’ was then repeated and the
subject's response recorded. The initial and post-transformation presentations for the bottles
tasks are shown in Figure 1.


FIGURE 1


THE FOUR TASKS IN THE ‘BOTTLES TEST’ SHOWING THE POSITION OF THE BOTTLES
BEFORE AND AFTER TRANSFORMATION

[Panels for each bottle task, showing the bottles before and after transformation.]


Following the ‘Bottles Test’, the alternative assessment procedure mentioned earlier
was piloted using plastic bags filled with liquid, but this will not be detailed here as various 
methodological problems were encountered which complicated the analyses. 


RESULTS AND DISCUSSION 


Scores on the two traditional verbal tests (omitting the items involving transformation
(c), division) and the bottles test varied between 0/4 and 4/4 overall and between 0/2 and 2/2
for equality and inequality tasks separately. We previously defined full conservers as those
scoring 3/3 for equality and inequality tasks separately (Wheldall and Poborca, 1980) but
in this study full conservers are defined as those scoring 4/4 overall or 2/2 for equality and
inequality separately. Assuming these conventions, the results of the alternative assessments
may be summarised as follows: the same number of children (seven: 17.5 per cent) conserved
fully on all three assessments and these were the same seven children in each case. This
applied to overall score and to equality and inequality separately. Most (24) of the remaining
33 children scored 0/4 on all three assessments, the remaining nine scoring 1/4 or 2/4 on one
or more of the assessments. This close comparability in performance over the three
assessments is expressed clearly in the inter-correlation matrices shown in Table 1. Note that the
bottles/traditional verbal test correlations are as high as the test/retest correlations for the
traditional verbal test.


TABLE 1


INTER-CORRELATION MATRICES FOR OVERALL PERFORMANCE
AND EQUALITY AND INEQUALITY PERFORMANCE SEPARATELY
BETWEEN THE TWO TRADITIONAL TESTS OF LIQUID CONSERVATION
AND THE ‘BOTTLES TEST’ (P < 0.001 FOR ALL COEFFICIENTS)


                            Traditional test 1    Traditional test 2

Overall performance
    Bottles test                  0.94                  0.96
    Traditional test 1             -                    0.95

Equality performance
    Bottles test                  0.91                  0.92
    Traditional test 1             -                    0.94

Inequality performance
    Bottles test                  0.87                  0.96
    Traditional test 1             -                    0.90


It can thus be seen that the ‘bottles test’ is as effective as the traditional verbal test in
determining full conservers, both procedures identifying the same individuals as conservers
and non-conservers. The results of this study, therefore, suggest that the ‘bottles test’ is an
acceptable alternative procedure for assessing conservation of liquid quantity which has the
advantage of being both quick and simple to administer, given that psychologists and others
will want to continue testing children using verbal procedures of this kind.


It must be recognised, however, that the ‘bottles test’ was administered after both
traditional verbal tests and also with a non-verbal assessment procedure intervening
between the two verbal tests. It would have been preferable to have controlled for order of
presentation effects but this was not possible in what was, in effect, an additional pilot
investigation tacked onto the main replication study of the Wheldall and Poborca findings. Order
of presentation of tests will be examined, however, in a forthcoming follow-up study which
will develop the ‘bottles test’ further by incorporating more tasks and introducing procedural
refinements.


RE-EXAMINATION OF THE COVARIATION OF FIELD INDEPENDENCE,
INTELLIGENCE AND ACHIEVEMENT 


By J. J. ROBERGE AND B. K. FLEXER
(Department of Educational Psychology, Temple University, Philadelphia, USA) 


SUMMARY. A principal factor analysis was performed on a matrix of correlations among
measures of field independence, intelligence, and achievement in reading and mathematics. 
Factors of general intelligence and verbal ability were identified. Field independence shared a 
substantial amount of variance with general intellectual ability. 


INTRODUCTION 


In a recent article in this Journal, Satterly (1979) described the results of hierarchical 
factor analyses of scores on measures of cognitive style, intelligence, and achievement in 
mathematics, English, and geography. Satterly reported that factors of general intelligence, 
field independence, levelling-sharpening, and verbal ability were identified. In addition, he 
concluded that ‘these findings provide some support for the independence of cognitive
style from general intelligence’ (p. 181). However, Satterly's findings and conclusions
with regard to the field independence factor are clouded by the fact that, in both of his 
analyses, the loadings of the achievement tests on the field independence factor were of the 
same magnitude as those for the embedded figures test. Finally, while some researchers 
(e.g., Witkin et al., 1977) maintain that field independence is an educationally relevant
individual difference variable, other researchers (e.g., Vernon, 1972) contend that field
independence does not define a unique factor which is distinct from general intelligence. Thus, the
aim of the present investigation was to re-examine the covariation of field independence
with intelligence and school achievement by administering standardised tests of these
constructs to a large sample of elementary school pupils.


METHOD 
Sample 
Four hundred and fifty pupils were randomly selected from classes in a suburban public 
school. Seventy-five boys and 75 girls were chosen at each of three grade levels (i.e., 6th, 7th,
and 8th). Mean chronological ages at these grade levels were 11.33, 12.36, and 13.28,
respectively; mean IQs were 115.50, 113.61, and 112.57, respectively.


Instruments 

The Group Embedded Figures Test (Oltman et al., 1971) was used to assess field inde- 
pendence. This test consists of seven practice items and 18 test items that require the pupil to 
locate particular simple shapes which are embedded in complex figures. The pupil's score on 
this test is the number of simple forms outlined correctly in a given period of time. 

IQ scores were obtained by administering the Lorge-Thorndike Intelligence Tests (Lorge 
and Thorndike, 1957). School achievement was based upon pupils' standard scores on the
Metropolitan Achievement Tests (Durost et al., 1970). The reading achievement measures
were the Word Knowledge and Reading tests; the mathematics achievement measures
were the Mathematics Computation, Mathematics Concepts, and Mathematics Problem
Solving tests.


RESULTS AND DISCUSSION


The correlations among the variables are shown in Table 1. An examination of the
correlation matrix reveals a sex difference favouring girls on the intelligence test (P < 0.05).
There are also significant (P < 0.001) positive relationships among scores on all of the
standardised tests.

The correlation matrix was analysed further by using a principal factor analysis. Two
factors were rotated to oblique simple structure by means of a biquartimin solution; the
correlation between these first-order factors was negligible (0.01). The results of this analysis
are shown in Table 2.

TABLE 1
PRODUCT-MOMENT CORRELATIONS AMONG THE VARIABLES

Variable                          1    2    3    4    5    6    7
1 Sex
2 Field independence            -04
3 Intelligence                   11   45
4 Word knowledge                -01   42   58
5 Reading                        06   38   58   77
6 Mathematics computation        00   45   55   59   58
7 Mathematics concepts          -07   47   61   63   61   80
8 Mathematics problem solving   -09   48   61   63   65   81   80

Note: Decimal points omitted.





TABLE 2
ROTATED FACTOR STRUCTURE MATRIX

                                      Factor
Variable                           I        II
1 Sex                            -0.01     0.17
2 Field independence              0.54    -0.03
3 Intelligence                    0.71     0.12
4 Word knowledge                  0.79     0.31
5 Reading                         0.80     0.37
6 Mathematics computation         0.84    -0.24
7 Mathematics concepts            0.87    -0.21
8 Mathematics problem solving     0.89    -0.20

Note: Factor I = general intelligence;
Factor II = verbal ability.


Factor I is a general intelligence factor. The field independence, intelligence, and 
achievement tests all load highly on this factor. Nevertheless, the amount of variance that 
the field independence test shares with this factor is considerably less than the amount of 
variance which the intelligence and achievement tests share with this factor. Factor II is a
verbal ability factor which is defined by the word knowledge and reading tests. Sex also has 
a positive loading on this factor which, given the large sample size, might be replicable. 
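
The comparison of shared variance made above can be illustrated by squaring the Factor I loadings from Table 2. This is only a minimal sketch: it assumes that "shared variance" is taken as the squared structure loading, and the variable names are illustrative rather than anything used in the original analysis.

```python
# Factor I loadings taken from Table 2 (rotated factor structure matrix).
factor_i_loadings = {
    "Field independence": 0.54,
    "Intelligence": 0.71,
    "Word knowledge": 0.79,
    "Reading": 0.80,
    "Mathematics computation": 0.84,
    "Mathematics concepts": 0.87,
    "Mathematics problem solving": 0.89,
}

# Assumption: variance shared with the factor = squared structure loading.
shared_variance = {
    name: loading ** 2 for name, loading in factor_i_loadings.items()
}

for name, value in shared_variance.items():
    print(f"{name:<30} {value:.2f}")
```

On this reading the field independence test shares roughly 29 per cent of its variance with Factor I, against roughly 50 to 79 per cent for the intelligence and achievement tests.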

Interestingly, the two factors found in the present study are similar to factors reported 
by Satterly (1979). But, in contrast to Satterly’s findings, there is little evidence of a field 
independence factor distinct from general intelligence which could be used to make predic- 
tions that cannot be made from standardised tests of intelligence and school achievement. 


WISC-R CORRELATES OF ACADEMIC ATTAINMENT AT 16½ YEARS


W. YULE,* R. D. GOLD† AND CAROL BUSCH*
(*University of London, Institute of Psychiatry and †Isle of Wight County Education
Psychological Service)


Summary. As part of a longitudinal study, a randomly selected group of 82 adolescents were
tested on both WISC-R and academic attainments. This paper reports the relationships between 
WISC-R and attainment, and with CSE and GCE results. 


INTRODUCTION 

Everyone agrees in principle with the need to restandardise individual psychological 
tests at regular intervals in order to take account of any secular changes. The revised version 
of the WISC, the WISC-R (Wechsler, 1974), appears on average to yield scores some 4 to 7 
points lower than the old version, in part reflecting the greater test sophistication of present 
day children (Swerdlik, 1977; Kaufman, 1979). Unfortunately, a direct effect of adopting 
the WISC-R is that all the data, built up over many years, relating WISC scores to academic 
attainment are immediately obsolete. WISC-R scores cannot be substituted in prediction 
equations for WISC scores, nor can we assume that the correlations between WISC-R and 
attainment will be of the same order as those between WISC and attainment. We urgently 
need empirical studies, preferably with unbiased samples, to shed light on these questions. 

In the USA, a small number of studies have already been reported relating WISC-R to 
attainment scores, usually on the Wide Range Achievement Tests (Jastak and Jastak, 1965). 
Most of the studies have reported data gathered on children referred to school psychologists 
for evaluation of learning problems or behaviour difficulties, and so are not fully repre- 
sentative. 

TABLE 1
SUMMARY OF RELATIONSHIP BETWEEN WISC-R AND WRAT

                                             WRAT
WISC-R                           Reading   Spelling   Arithmetic

Brooks (1977) N = 30; 6-10 years
    V.S.IQ                          64        55         74
    P.S.IQ                          71        70         71
    F.S.IQ                          70        65         76
Hartlage and Steele (1977) N = 36; mean age = 7 yrs 9 months
    V.S.IQ                          75        35         76
    P.S.IQ                          54        33         67
    F.S.IQ                          68        35         76
Schwarting and Schwarting (1977) N = 282; 6-16 years
  (a) 6-11 years
    V.S.IQ                          68        61         69
    P.S.IQ                          63        60         69
    F.S.IQ                          72        65         75
  (b) 12-16 years
    V.S.IQ                          74        69         66
    P.S.IQ                          40        34         55
    F.S.IQ                          62        56         66
Hale (1978) N = 155; 6-16 years
    V.S.IQ                          54        49         64
    P.S.IQ                          29        26         44
    Full Scale correlations not quoted.

Decimal points omitted.

The results of four published studies are summarised in Table 1. It can be seen that
within these atypical samples, the correlations between WISC-R IQs and attainment are all
positive and generally quite high. As is made clearer in the Schwarting and Schwarting (1977)
study, the correlations may be higher during elementary school than in secondary school. In
their study the correlations between Performance Scale IQ and attainment are lower for
secondary age children than those with Verbal Scale IQ which remains constant across the
two age groups. Interestingly, there is a slight tendency for WISC-R to correlate most highly
with Arithmetic, next with Reading, and for the correlation with Spelling to be noticeably
lower.


Stedman et al. (1978) attempted to relate Kaufman’s (1975) factor scores to WRAT
attainment scores in a population of 76 children aged 6 to 13 years. Although significant, the 
correlations were all very low which suggests that the overall IQ scores used in the other 
studies yield more stable predictions. 


To date, there are no studies relating WISC-R to academic attainment in British school- 
children. This paper provides such data for a sample of school leavers. 


METHOD 


As part of a prospective study of children with learning difficulties, a control group of
150 children were originally tested on the WPPSI at age 5½ years (Yule et al., 1969).
Eighty-seven of these children were identified in schools in the Isle of Wight shortly before
reaching the compulsory school leaving age. They were tested on the WISC-R together with a
battery of group tests of attainment, namely the NFER Sentence Reading Test NS6, the Vernon
Graded Word Spelling Test (Vernon, 1977) and the Vernon Graded Arithmetic-Mathematics
Test (Vernon and Miller, 1976), in the spring term of 1979 when aged 16½ years. Because of
persistent absence, two children were not tested on WISC-R and a different three children did
not complete the attainment testing. Thus, only 82 were tested on both batteries. Data were
also gathered on their performance in public examinations. Further details of the overall
study are given elsewhere (Yule et al., 1981a, 1981b).


Essentially, these 87 adolescents form a normal control sample for the other studies. 
We have demonstrated (Yule et al., 1981b) that they are representative of the original 150 in
terms of their scores on the WPPSI. The social class composition of the sample is shown in
Table 2. The untraced children contained an excess from Social Class I and II, who
presumably did not continue in state schools. However, the reduced (tested) sample thereby
becomes more representative of the general population. The children were tested individually 
on WISC-R by experienced psychologists. The group testing was carried out in schools by 
the two principal investigators. 





TABLE 2
SOCIAL CLASS COMPOSITION OF THE TESTED AND UNTRACED
SAMPLE

                       Tested          Untraced       Population
Social Class           N      %        N      %       (…960) %
I & II                18    20.7      23    36.5         18
III non manual         9    10.3       8    12.7          1
III manual            42    48.3      17    27.0         48
IV & V                16    18.4      13    20.6         22
Unknown                2     2.3       2     3.2          1
Total:                87*             63

* 2 children were tested on attainment tests but not
WISC-R as they were persistently absent; 3 children were
tested on WISC-R but not on the attainment tests.


RESULTS


Table 3 shows the means and SDs of this sample on the WISC-R, Reading, Spelling and 
Mathematics tests. Table 4 shows the intercorrelations among the tests. It can be seen that 
the average Full Scale IQ of this sample is very close to 102 with a standard deviation of
15.40. Given that this reduced sample is more representative of the general population in
terms of fathers' social class than was the original sample of 150, then it can be concluded
that the WISC-R is reasonably normed for British school leavers. 

The intercorrelations with reading are clearly similar to those found by Schwarting and 
Schwarting (1977) as shown in Table 1. Verbal Scale IQ is more closely correlated to reading 
scores at 16½ years than is Performance Scale IQ. The intercorrelations between WISC-R
IQs and spelling are almost identical with those yielded in the American study of 12- to
16-year-olds. The correlations between IQs and Mathematics are higher than those with
reading and spelling, a finding not reflected in the American studies in Table 1. 

The conclusion that can be drawn from these data is that Verbal IQ is a reasonable 
predictor of academic attainment, sharing approximately 50 per cent of the variance with 
scores on tests of reading, spelling and mathematics. 
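
The figure of approximately 50 per cent shared variance can be checked by squaring the Verbal Scale IQ correlations reported in Table 4. The sketch below assumes that shared variance is the squared product-moment correlation; the names are illustrative.

```python
# Verbal Scale IQ correlations with each attainment test, from Table 4.
verbal_iq_correlations = {
    "Reading": 0.68,
    "Spelling": 0.69,
    "Mathematics": 0.79,
}

# Assumption: proportion of shared variance = r squared.
shared = {test: r ** 2 for test, r in verbal_iq_correlations.items()}
mean_shared = sum(shared.values()) / len(shared)

for test, value in shared.items():
    print(f"{test:<12} {value:.2f}")
print(f"Mean         {mean_shared:.2f}")
```

The three squared correlations are about 0.46, 0.48 and 0.62, which average a little over one half.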

Table 5 shows the relationship between WISC-R full scale IQ and examination results on 
CSE and GCE. The relationship is highly significant (χ² = 25.96; df = 4; P < 0.0001).


TABLE 3
MEANS AND SDs ON WISC-R AND ATTAINMENT TESTS

Test                               Mean       SD
WISC-R Full Scale IQ              102.09     15.40
       Verbal Scale IQ             99.44     16.06
       Performance Scale IQ       104.78     15.30
Reading (Raw)                      48.33      9.82
Spelling (Standard)                99.17     15.45
Mathematics (Standard)             98.49     14.51

TABLE 4
INTERCORRELATION BETWEEN WISC-R AND ATTAINMENT (N = 82)

                                  Reading   Spelling   Mathematics
WISC-R Verbal Scale IQ             0.68      0.69        0.79
       Performance Scale IQ        0.41      0.31        0.64
       Full Scale IQ               0.61      0.58        0.80

TABLE 5
WISC-R FULL SCALE IQ AND EXAM PASSES

                       No         GCE: D, E    GCE: A, B, C
                       passes                  or CSE 1        Total
FS IQ less than 90       4           11            2             17
      90-110             9           17           17             43
      More than 110      0            3           21             24
Total                   13           31           40             84

χ² = 25.96 with 4 df: P < 0.0001

The implication of this table is that a 16-year-old with an IQ of below 90 has only about a
12 per cent chance of obtaining a GCE grade A, B or C, or a CSE grade 1, but this is better 
than many might anticipate. If the adolescent has an IQ of above 110, then there is over an 
83 per cent likelihood of attaining at least one examination at this high standard. In inter- 
preting these findings, it must be remembered that the secondary schools on the island had 
been fully comprehensive throughout the time the pupils concerned had been there. 
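
The χ² value reported for Table 5, and the proportion of each IQ band reaching the highest examination category, can be recomputed from the cell frequencies. This is a minimal sketch using the ordinary Pearson chi-square for a contingency table; the variable names are illustrative, and the source does not say how the original statistic was calculated.

```python
# Observed frequencies from Table 5.
# Rows: Full Scale IQ below 90, 90-110, above 110.
# Columns: lowest, middle and highest examination-result categories.
observed = [
    [4, 11, 2],
    [9, 17, 17],
    [0, 3, 21],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson chi-square: sum of (observed - expected)^2 / expected over all cells.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi_square += (count - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)

# Proportion of each IQ band falling in the highest category.
top_rate = [row[-1] / total for row, total in zip(observed, row_totals)]

print(f"chi-square = {chi_square:.2f}, df = {df}")
print([f"{rate:.1%}" for rate in top_rate])
```

The recomputed statistic agrees with the tabled value to two decimal places, and the first proportion (2 of 17) corresponds to the figure of about 12 per cent quoted above.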


DISCUSSION 


The results of this study strongly suggest that the WISC-R can be used to assess Full 
Scale IQ with British school-leavers. The correlations with attainment tests of reading, 
spelling and mathematics compare favourably with studies previously published in the USA. 
These data should be particularly helpful to those involved in vocational guidance with school 
leavers. The data relating IQ to public examination results should likewise fill a gap in the 
data base on which many crucial decisions have to be taken. 


AUSTRALIAN AND FILIPINO INVESTIGATIONS OF THE INTERNAL
STRUCTURE OF BIGGS’ NEW STUDY PROCESS QUESTIONNAIRE 


By J. HATTIE 
(University of New England) 


AND D. WATKINS
(Australian National University) 


SUMMARY. The internal structure of the new Study Process Questionnaire (Biggs, 1979) was 
investigated with samples of 255 Australian and 173 Filipino university students. The internal
consistency reliabilities, item and subscale factor analyses were quite favourable for the
Australian sample, supporting Biggs' model of the study process complex. However, the low
to moderate reliabilities and failure of factor analysis to support Biggs' model indicate that the
SPQ may not be suitable for Filipino students.


INTRODUCTION 


Recent research has made clear that the naive assumption that there are such things as
'good' study methods which lead to academic success is unfounded (Lafitte, 1963; Entwistle
et al., 1971). Rather it has been demonstrated that the relationships among study methods, 
academic success, the context of learning and characteristics of the individual student are 
complex (Biggs, 1976, 1978). Biggs (1978) developed the Study Process Questionnaire 
(SPQ) as a means of operationalising the study process domain. This inventory consisted of 
80 items divided into ten unidimensional scales. It has been shown that this version of the 
SPQ had moderate scale internal consistency reliabilities and there was also support for 
Biggs’ proposed three dimensional (Reproducing, Internalising, and Achieving) underlying 
model of the study process domain (Biggs, 1978; Watkins and Hattie, 1980). 

Biggs' latest version of the SPQ, which is the focus of this study, is based on the
proposition that students tend to have several broad motives for studying and several broad
strategies for going about their work. He argues that, while many students have mixed
motives and strategies, they are usually motivated in one particular way and their study 
strategy is compatible with their motive. Based on his earlier research, Biggs considers the 
three most important motive/strategy dimensions to be the following: 


(1) Utilising 
Motive: to undertake further study as a means for obtaining a better job, more 
money, or some other extrinsic need. 
Strategy: overall, simply to avoid failure and specifically to focus on minimal 
content, primarily factual, as prescribed in class handouts, course outlines, etc., and 
to rote learn this necessary minimum for reproduction in examinations and/or 
assignments. 

(2) Internalising
Motive: to work out one's philosophy of life and to develop special interests and
abilities; studies are selected therefore that hold maximum intrinsic interest.
Strategy: to read widely and with maximal understanding (independently of course
requirements), to integrate various subjects and make them personally meaningful.


(3) Achieving 
Motive: to excel in studies as part of a general competitive approach to life and win 
high status thereby; more specifically, to study with a view to maximising grades 
awarded. 
Strategy: close orientation to course outlines, work schedule tightly organised, 
assignments completed on time, etc. 
(from Biggs, 1979, p. 2) 

This latest version of the SPQ (which will be the only version discussed in the remainder
of this article) consists of 42 items each tapping one of the three broad dimensions presented 
above and each divided into motive and strategy subscales of seven items in length. 

The aim of this research was to investigate the following aspects of the SPQ when 
administered to Australian and Filipino university students:


(a) The internal consistency of the scales and subscales.
(b) The factor structure of the SPQ items.
(c) The factor structure of the SPQ subscales (with particular reference to Biggs’ motive/ 
strategy model of study processes). 


METHOD 
Sample 


The Australian subjects were 255 first year, full-time undergraduates at the University 
of New England, a small rural university in northern New South Wales. The Filipino 
sample consisted of 173 freshmen attending the College of Liberal Arts and Sciences at the 
University of San Carlos, a major university in the central Philippines. English was 
the language of instruction at this university, and two Filipino educationalists considered the 
items relevant to and comprehensible by Filipino tertiary students. 


Procedure 

The Australian data were collected through a mail survey. The Filipino subjects com- 
pleted the SPQ during regular lecture time. Due to an unfortunate proof-reading error three 
items (one from each of the Utilising Motive, Internalising Motive, and Achievement Strategy 
sub-scales) were omitted from the SPQ when administered in the Philippines. Therefore 
when Filipino internal consistency coefficients are reported below for these subscales and the 
corresponding total scales, the values reported are corrected for length. 
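
The text does not say which correction for length was applied. One common choice is the Spearman-Brown prophecy formula, and the sketch below assumes that it was the method used. The function name and the input value are hypothetical and serve only to show how a coefficient from a six-item subscale might be projected to the intended seven items.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Project a reliability coefficient to a test `length_factor` times as long.

    Spearman-Brown prophecy formula: r' = k * r / (1 + (k - 1) * r).
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical example: a six-item subscale projected to seven items.
observed_alpha = 0.47  # illustrative value, not taken from the study
corrected_alpha = spearman_brown(observed_alpha, 7 / 6)
print(f"{corrected_alpha:.2f}")
```

Under the same assumption, a total scale that lost one of its 14 items would use a length factor of 14/13.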


RESULTS AND DISCUSSION 
The internal consistency reliabilities, coefficient α, are reported in Table 1. It can be
seen that these values were very adequate for the Australian students and fairly encouraging
for the Filipinos, considering the latter's lesser familiarity with English.

TABLE 1
SUMMARY OF INTERNAL CONSISTENCY DATA FOR
THE AUSTRALIAN AND FILIPINO STUDENTS

                                 Coefficient α
                             Australia   Philippines
SPQ Scales
  Utilising                     0.75        0.58*
  Internalising                 0.79        0.70*
  Achieving                     0.77        0.68*
SPQ Subscales
  Utilising Motive              0.60        0.51*
  Utilising Strategy            0.69        0.51
  Internalising Motive          0.67        0.57*
  Internalising Strategy        0.72        0.60
  Achieving Motive              0.70        0.57
  Achieving Strategy            0.74        0.57*

* Corrected for length (see Procedure).


Factor structure 

The SPQ items were analysed using unrestricted maximum likelihood common factor 
analysis (Jöreskog, 1969). Given the certainty of non-linear relations between the items, which
is typical with such data, rather than use statistical criteria or other heuristics for choosing
the number of factors, the decision as to the number of factors was made solely on the grounds
as to whether the factors could be interpreted. For the Australian sample two, three and 
six factor solutions were interpretable. For the six factor solution, the six scales outlined by 
Biggs were clearly evident. For the three factor solution, the first factor related to inter- 
nalising with some high loadings on utilising strategy; the second factor related to utilising 
with off loadings on achievement motivation; and the third factor related to achievement 
strategy. With only two factors extracted, the first related to strategy and internalising motive, 
and the second to achievement and utilising motivation. 

The Filipino data clearly came down to a two factor solution: one factor relating to 
motivation and the other to strategy. The six and three factor solutions were not clearly 
interpretable.

From these item analyses, two hypotheses were generated to be tested on the subscale
scores. The first hypothesis was based on two factors: a motive and a strategy factor. The
second hypothesis consisted of a utilising, an internalising, and an achievement factor. The 
latter model corresponded to Biggs’ model of the study process domain. Confirmatory 
maximum likelihood factor analysis was used to test these two models. In confirmatory 
factor analysis a pattern of loadings, with many constrained (usually to zero) and the rest free
to vary, can be tested and a χ² statistic calculated to evaluate the goodness of fit between the
observed and expected matrix. Given the sensitivity of χ² to large sample sizes (see Mulaik,
1975) and given that we are primarily interested in evaluating which model best fits the data,
then we can use the difference in χ² from the two models (with Δdf = df₁ − df₂) to evaluate
which model is most descriptive. McDonald and Leong's (1976) analysis of covariance
program was used. The results are presented in Table 2. 


TABLE 2
GOODNESS OF FIT OF DATA TO HYPOTHESISED UNDERLYING
MODELS FOR AUSTRALIAN AND FILIPINO STUDENTS

                                  Australia        Philippines
                                  χ²      df       χ²      df
Hypothesis 1
(Motive/Strategy)               144.13     8      52.96     8
Hypothesis 2
(Utilising/Internalising/
Achieving)                       25.72     6      47.67     6
Difference between
hypothesis 1 and 2              118.41     2       5.29     2

For the Australian group the three factor hypothesis was statistically significantly better
than the two factor hypothesis. For the Filipino data there was no statistically significant
difference between the two models (probability of the Δχ² = 0.08). Hence on parsimony
grounds alone we should prefer the two factor model; this was confirmed by examination of
the factor patterns.
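
The χ² difference test described above can be sketched directly from the values in Table 2. The helper below is an illustrative function, not part of any program named in the text. It uses the closed form for the upper-tail chi-square probability, which holds only for an even number of degrees of freedom (here Δdf = 2).

```python
import math


def chi_square_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variate (closed form, even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(
        half ** k / math.factorial(k) for k in range(df // 2)
    )


# (chi-square, df) for Hypothesis 1 and Hypothesis 2, from Table 2.
fits = {
    "Australia": ((144.13, 8), (25.72, 6)),
    "Philippines": ((52.96, 8), (47.67, 6)),
}

p_values = {}
for country, ((chi1, df1), (chi2, df2)) in fits.items():
    delta_chi = chi1 - chi2
    delta_df = df1 - df2
    p_values[country] = chi_square_sf_even_df(delta_chi, delta_df)
    print(
        f"{country}: delta chi-square = {delta_chi:.2f}, "
        f"df = {delta_df}, p = {p_values[country]:.3g}"
    )
```

Computed this way, the Filipino difference has a probability of roughly 0.07, close to the 0.08 quoted in the text, while the Australian difference is significant far beyond conventional levels.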

Thus this factor analysis of the SPQ subscales lent support to Biggs’ motive/strategy 
model of study processes from the Australian but not the Filipino data. 


CONCLUSIONS 


This investigation of the internal structure of the SPQ provided very satisfactory results
from the Australian sample: adequate to good internal consistency coefficients; item factor
analysis which supported the existence of Biggs' subscales of the SPQ; and a subscale factor
analysis which supported the validity of Biggs’ model of the study process domain. The
SPQ can then be recommended for further use with Australian students.

Unfortunately the Filipino data (with their low to moderate internal consistency
coefficients and factor analyses which failed to support Biggs’ model) suggest that the SPQ
may not be appropriate for use with Filipino students.

The writers consider that further research is required with a wider range of Filipino and
Australian students before it is possible to determine if the results of this study are a reflection
of true linguistic, educational, or personological differences between students of these
countries or are simply attributable to sampling error. It is certainly true that earlier research
has indicated that major differences exist between Filipino and both Australian and US
university students' views of the aims and methods of tertiary education (Watkins and
Malimas, 1980). However, of course, such findings do not necessarily indicate that the same
measuring instruments are not valid in these countries. Indeed research on the US developed
Inventory of Learning Processes (Schmeck et al., 1977) with the same Australian and
Filipino subjects as used in this study found more favourable factor analytic evidence for
that inventory with the Filipino rather than the Australian sample (Watkins and
Hattie, …).

Additional evidence is thus required to investigate to what extent the SPQ is country
bound and also to provide support for the validity of the SPQ as a measure of study methods.


THE DEVELOPMENT OF A SELF-ESTEEM QUESTIONNAIRE


By D. LAWRENCE
(Churchlands College of Advanced Education, Perth, Western Australia*) 


Summary. The questionnaire (LAWSEQ) was constructed to measure self-esteem in primary 
school children. The questions were selected to cover the main areas of concern shown by 
children in an earlier counselling study. There were 20 questions in the original version, including 
four innocuous ones, and with a parallel form (Form B). In the final version the questionnaire
comprises 16 items, chosen as a result of an item analysis carried out by the University of Bristol 
Department of Child Health. 


INTRODUCTION 


It is almost 100 years since William James brought the Self out of the realms of phil- 
osophy and defined it as a legitimate study for the psychologist, and yet there is still a lack of 
consensus regarding a proper definition of self-concept. There is a lack of agreement not 
only at the conceptual level but also with regard to methodology in the assessment of self- 
concept (Wells and Marwell, 1976). 

For William James the Self was comprised of four parts—the physical self, the material 
self, the social self, and the spiritual self—and was of a wholly conscious origin. This view is 
to be contrasted with the psychoanalytical definition which included unconscious processes 
(Freud et al.). Since Freud a vast number of psychologists have attempted to define the
concept, notable among these being Allport (1937), Mead (1934), Symonds (1951), Maslow 
(1968), Cooley (1902), Jourard (1957) and Rogers (1951). Although these writers developed 
the theme in different ways they were all in agreement with William James' original definition 
of the self-concept as a hypothetical construct which is reflexive, i.e., the 'knower' and the
'known' are the same person.

It was not until the work of Diggory (1966) and other social psychologists that the par- 
ticular aspect of the Self known as self-esteem became a common object of study. 

Self-esteem is usually defined as a personal judgment of worth lying along a dimension
with 'positive' and 'negative' ends (Cottle, 1965). In addition, it is usually defined in terms
of self-attitudes which have an emotional and behavioural component (Rogers, 1951).
There is considerably less agreement over whether the term self-esteem should be reserved
for a 'global' attitude, as originally defined by Allport (1937), or whether it should be used
in the William James sense of specific attitudes. Cohen (1959) referred to the self-esteem as
the degree of correspondence between the 'ideal self' and the 'actual self', whereas Argyle
(1967) regarded the self-esteem as the individual's effective evaluation of this discrepancy.
Whether one agrees with Brownfain (1952) that self-esteem is the degree to which a person
accepts him or herself, or whether with Allport that it is the person's evaluation of aspects of
him or herself, the feeling of worth (Coopersmith, 1967), there can be little doubt that this
aspect of the self-concept known as self-esteem is beginning to be of some practical
significance in educational psychology. Many writers have shown a relationship between
scholastic achievement and self-esteem (Brookover et al., 1964). The present author, using the
LAWSEQ, discovered a small but significant correlation between reading attainment and
self-esteem. This was substantiated by Barker (1979) who also showed a positive correlation
between the LAWSEQ and arithmetic scores.


Measurement of self-esteem 

The various methods used in the assessment of self-esteem inevitably reflect the concep- 
tual disagreements. These are admirably reviewed in Wylie (1961, 1974), Crandall (1973) 
and Burns (1979). 

One of the most common methods has been the verbal method and in particular the self- 
report questionnaire (Piers and Harris, 1964; Coopersmith, 1967). The semantic differential 


* This study was carried out while the author was Chief Psychologist to the Somerset County 
Council Education Department, England. 
developed by Osgood et al. (1957) using bi-polar adjectives is often used and depends on 7- or
5-point scales. Many variations of these have found favour with different authors, but it is
not the intention here to attempt a review of them, merely to make the point that there is still
a lack of consensus on the best methods to use, and it was this which led the present author to
follow a lead first discovered in the counselling of retarded readers (Lawrence, 1973). Case
studies from these counselling programmes revealed that children of 8 to 11 years of age
tended to be concerned about the opinions of others in three main areas: (1) the opinions of
peers, (2) the opinions of teachers and (3) the opinions of parents. This seemed to support
Cooley's 'looking glass' theory of self (Cooley, 1902). Accordingly, it was decided to define
self-esteem as the child's affective evaluation of the sum total of his or her characteristics both 
mental and physical. 


METHOD AND RESULTS 
The LAWSEQ 


The LAWSEQ was introduced originally as a series of 30 adjectives. They were selected 
on the basis of frequency of appearance in the counselling case studies. One hundred and 
twenty-seven children from three junior schools in Weston-super-Mare, aged 9 to 10 years, 
were asked to rate themselves either YES, NO or DON'T KNOW on each adjective, e.g., friendly,
brave, dirty, jealous, etc. Later, the same children were asked to rate themselves on the same
list as the sort of person they would like to be. Thus two sets of scores were obtained for
each child: (i) self-image and (ii) ideal-self. The discrepancy between the two scores was
regarded as the measure of self-esteem.

These children were then assessed on the Burt Word Recognition Test and a correlation
of 0.394 was found (P < 0.025) between LAWSEQ and Reading Age.

Although the instrument appeared to have potential as a means of measuring self- 
esteem, its form of presentation had led to several misunderstandings and altogether it was 
considered to be a crude test in need of elaboration. Accordingly, a series of 40 questions 
was devised and another 40 questions in parallel form. Both questionnaires were adminis- 
tered to a random sample of 76 9-year-olds. The results from the two forms of the ques- 
tionnaire were then compared. Questions showing less than 80 per cent agreement were 
discarded. This left 16 items in both Forms A and B. Four other questions of an innocuous
nature were then added to make the questionnaires a little less threatening, making 20
questions in all (Appendix 1), and a parallel form of another 20 questions (Appendix 2). 

The Forms A and B of this new version of the LAWSEQ were then administered to 431 
9-year-olds and a correlation of 0.83 was found between both forms (P < 0.01).


The LAWSEQ final version 

In 1979 Walter Barker of the University of Bristol Child Health and Education Study 
(CHES) carried out an item analysis of all the questions in the A and B forms.

Four hundred and nineteen children aged 9 years filled in both questionnaires and also a 
short form of the Edinburgh Reading Test. Four hundred and twenty children completed 
the Friendly Mathematics Test.

The discriminating value of each item in relation to the questionnaire as a whole is 
described using a beta statistic. The 'P' values indicating the V-ridge probabilities relate to
the Student 't' values of the coefficients. Clearly, items with P values above 0.02 are of little
use; also those with P above 0.5, as they are predicting in the opposite direction.

The ‘unique’ value refers to the unique contribution of each item to the particular 
regression in which it featured —the percentage variance in reading or arithmetic accounted 
for by the particular item beyond the contribution made by all the other items in combination. 

Separate regressions examined the item performance when boys’ and girls’ reading samples 
were analysed on their own. 

As a result of this analysis 12 questions were considered to be particularly discriminating 
(four from Form A and eight from Form B). These items are starred in Table 1. Another 
four innocuous questions were added to these, making a final questionnaire of 16 items. 

This final version was used in the National Child Health and Education Study under the 
direction of Professor Neville Butler and was given to a sample of 15,000 boys and girls in 
the United Kingdom who were born during the week 5th-11th April, 1970. (Anyone wishing 

LAWSEQ PUPIL QUESTIONNAIRE
(FINAL VERSION)

Response columns: Yes / No / Don't Know

 1. Do you think that your parents usually like to hear about your ideas?
 2. Do you often feel lonely at school?
 3. Do other children often break friends or fall out with you?
 4. Do you like team games?
 5. Do you think that other children often say nasty things about you?
 6. When you have to say things in front of teachers, do you usually feel shy?
 7. Do you like writing stories or doing other creative writing?
 8. Do you often feel sad because you have nobody to play with at school?
 9. Are you good at mathematics?
10. Are there lots of things about yourself you would like to change?
11. When you have to say things in front of other children do you usually feel foolish?
12. Do you find it difficult to do things like woodwork or knitting?
13. When you want to tell a teacher something, do you usually feel foolish?
14. Do you often have to find new friends because your old friends are playing with somebody else?
15. Do you usually feel foolish when you talk to your parents?
16. Do other people often think that you tell lies?

Scoring Key:

Questions 4, 7, 9, 12 are distractors.
Score +2 for YES answer to Q. 1.
Score +2 for NO answers to remaining scored questions.
Score +1 for DON'T KNOW answers to scored questions.
Score 0 for all other possibilities.
Maximum possible score in the direction of high self-esteem +24.

to use the questionnaire should contact the author c/o Psychological Service, Education 
Department, The Mount, Taunton, Somerset, England.) 
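
As an illustration only, the scoring key of the final version of the LAWSEQ might be expressed as the short Python sketch below. The function name `lawseq_score` and the answer labels 'yes', 'no' and 'dont_know' are assumptions made for this example; they do not appear in the questionnaire itself.

```python
# Illustrative sketch of the LAWSEQ (final version) scoring key.
# The answer labels and function name are assumptions for this example.

DISTRACTORS = frozenset({4, 7, 9, 12})   # unscored distractor items
YES_ITEMS = frozenset({1})               # +2 for a YES answer
ALL_ITEMS = frozenset(range(1, 17))      # questions 1 to 16
NO_ITEMS = ALL_ITEMS - DISTRACTORS - YES_ITEMS  # +2 for a NO answer

VALID_ANSWERS = ("yes", "no", "dont_know")


def lawseq_score(answers):
    """Return the total score for a dict {question number: answer}."""
    total = 0
    for question, answer in answers.items():
        if question not in ALL_ITEMS:
            raise ValueError(f"unknown question number: {question}")
        if answer not in VALID_ANSWERS:
            raise ValueError(f"unknown answer: {answer!r}")
        if question in DISTRACTORS:
            continue  # distractors never contribute
        if answer == "dont_know":
            total += 1
        elif question in YES_ITEMS and answer == "yes":
            total += 2
        elif question in NO_ITEMS and answer == "no":
            total += 2
        # every other combination scores 0
    return total


# A child giving the highest self-esteem response to every scored item.
highest = {q: "no" for q in ALL_ITEMS}
highest[1] = "yes"
print(lawseq_score(highest))  # 24, the maximum stated in the key
```

With twelve scored items at two points each, the sketch reproduces the stated maximum of +24, and a child answering DON'T KNOW throughout would score 12.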











TABLE 1
LAWSEQ CHARACTERISTICS

          Correlation (1)  Maths (2)          Reading (2)      Reading   Reading
                                                               Boys      Girls
No.       Maths   Reading  P        unique %  t      unique %  unique %  unique %
A1        0.57    0.50     0.4      0.004    -0.9    0.09      0.21      0.02
A2        0.40    0.40     0.05     0.22      1.4    0.18      0.33      0.11
A3        0.34    0.33     0.6      0.002    -3.9    1.34      0.81      1.79
A4 *      0.35    0.43     0.08     0.16      2.2    0.43      0.92      0.004
A6 *      0.49    0.26     0.01     0.47      1.2    0.12      0.00      0.82
A7       -0.05    0.09     092      0.41      1.6    0.24      0.58      0.006
A8        0.24    0.21     0.9      0.17     -2.3    0.49      0.36      0.29
A10       0.54    0.29     0.6      0.01     -3.0    0.82      1.23      0.18
A11       0.13    0.29     0.13     0.11      1.3    0.16      0.03      0.42
A12 *     0.42    0.36     0.007    0.53      4.4    1.74      1.93      1.05
A14       0.37    0.37     0.02     0.36     -0.6    0.02      0.17      098
A15 *     0.45    0.52     0.2      0.04      2.9    0.65      0.63      0.26
A16       0.34    0.44     0.3      0.02      0.3    0.009     0.03      0.27
A17       0.41    0.52     09       1.12      0.3    0.006     0.11      0.24
A19       0.32    0.34     0.05     0.23      0.8    0.04      0.03      0.01
A20       0.49    0.50     0.20     0.05      1.3    0.13      0.08      0.19
B1        0.43    0.29     0.3      0.03     -3.4    1.05      1.69      0.32
B2 *      0.59    0.37     0.0001   1.22      0.6    0.04      0.49      0.03
B3       -0.02    0.23     0.0000   4.8       1.9    0.33      0.06      0.88
B4        0.31    0.35     0.11     0.13      0.4    0.01      0.03      0.15
B6 *      0.53    0.44     0.0001   1.3       1.7    0.27      0.81      0.01
B7 *      0.44    0.29     092      0.36      3.4    1.05      1.17      0.79
B8 *      0.51    0.25     0.0001   1.19      2.4    0.39      1.34      0.04
B9 *      0.55    0.52     0.0000   1.95      141    0.11      0.60      0.001
B11 *     0.54    0.44     0.0000   2.02      2.9    0.35      0.06      1.21
B12       0.15    0.51     0.004    0.63     -0.9    0.07      0.20      0.001
B13       0.40    0.32     0.04     0.24      0.05   0.0002    0.002     0.01
B14 *     0.44    0.24     0.0000   4.18      2.7    0.56      0.92      0.20
B15       0.46    0.73     0.8      0.05     -2.7    0.55      0.29      0.60
B16       0.35    0.38     0.0000   1.36     -1.5    0.18      0.08      0.21
B18 *     0.50    0.23     09000    1.32      2.7    0.57      0.31      0.49
B20       0.55    0.63     0.07     0.19     -0.9    0.06      0.005     0.15

* Items chosen in final version of LAWSEQ. Reading N = 419; Maths N = 420. 


Note: (1) Correlations between individual items in LAWSEQ and the reading or maths scores. 
(2) The standard errors and probabilities of the regression coefficients were derived in all cases; only 
single sets of P or t parameters are printed here by way of illustration. 


DISCUSSION 


The relationship between scholastic attainment and personality characteristics has been 
one of the most interesting discoveries in educational psychology, and it is now recognised 
that the child's personality is an important factor in determining success. 


Attention has been drawn to one aspect of personality study which should be the direct 
concern of all teachers—viz., self-concept. Teachers are in a powerful position to be able to 
influence a child's self-concept (Staines, 1958). They should pay particular attention to the 
child with poor self-esteem. The child who has experienced regular failure comes to lack 
confidence as a person. He or she begins to expect failure and a remedial approach should 
take this into account. Indeed, there is some evidence to show that behaviour changes as the 
self-image changes, and it may be useful to try to change the child's self-concept before 
attempting a more formal teaching of skills. Counselling programmes designed by the author 
specifically to improve the child's self-esteem have shown that it is possible to improve the 
child's reading attainment in this way. A current investigation into the counselling of 100 
retarded readers has shown similar results. This study is part of a Ph.D. programme con- 
ducted through the University of Bristol and will be published shortly. 

The LAWSEQ has been devised to assist in the identification of children who may 
suffer from poor self-esteem. Further research on the questionnaire is needed, particularly an 
evaluation of its temporal reliability, and it is hoped that others will find the results to date of 
sufficient interest to assist in its development. 


ACKNOWLEDGMENT. I am greatly indebted to Walter Barker, Bristol University Child Health 
Unit, for his analysis of this data and for his permission to include this (Table 1). I should also like to 
thank Dr. Philip Gammage, Department of Education, University of Bristol, for his valuable support. 
APPENDIX 1 
SELF-ESTEEM QUESTIONNAIRE 

Form A 

   1. Do other people usually think that you are friendly?
   2. Do you feel silly when you have to talk to teachers?
   3. Do other children usually want to sit by you?
*  4. Do you think that your parents usually like to hear about your ideas?
†  5. Do you often like to watch television?
*  6. Do you often feel lonely at school?
   7. Do you usually find it easy to talk to teachers?
   8. Do you often wish you were somebody else?
†  9. Do you often like to listen to music?
  10. Do you often feel silly in front of other children?
  11. Do you think that your parents usually trust you to do things properly?
* 12. Do other children often break friends with you?
† 13. Do you like watching sports?
  14. Do you often feel silly in front of your parents?
* 15. Do you think that other children often say nasty things about you?
  16. Do you think that your teachers usually trust you to do things properly?
  17. Do you think that your parents usually like to hear you talk?
† 18. Do you like to go shopping for presents?
  19. Do you think that other people usually believe what you say?
  20. Do you feel that teachers usually like to hear about your ideas?

* Chosen for final version of LAWSEQ. 
† Innocuous questions. 

SCORING KEY A 

Score +2 for the following numbers answering YES: 1, 3, 4, 7, 11, 16, 17, 19, 20. 
Score +2 for the following numbers answering NO: 2, 6, 8, 10, 12, 14, 15. 
Score +1 for all numbers answering DON'T KNOW. 
Score 0 for all other possibilities. 


APPENDIX 2 

SELF-ESTEEM QUESTIONNAIRE 

Form B 

  1. Do other children often ask you to play with them?
* 2. When you have to say things in front of teachers, do you often feel shy?
     Do you often like to go to the sea-side?
     Do other people often think that you tell lies?
     Do you sometimes like to go fishing?
     Are your teachers usually interested in listening to your ideas?
     Do you find that you usually have someone to sit by in class?
     Do your parents usually like you to talk to them?
     Do you often like to go for long walks in the country?
     Do you often feel sad because you have nobody to play with at school?
     Are there lots of things about yourself you would like to change?
     When you have to say things in front of children, do you often feel foolish?
     When you want to tell a teacher something, do you often feel foolish?
     Do you often like to go to the cinema?
     Do you often have to find new friends because your old friends are playing with somebody else?
     Do your parents often ask you to do important jobs?
     Do you think that the things people say about you are usually nice?
     Do you think that your parents are usually interested in listening to the things you say?

* Chosen for final version of LAWSEQ. 
† Innocuous questions. 

SCORING KEY B 

Score +2 for the following numbers answering YES: 1, 3, 4, 12, 13, 15, 16, 20. 
Score +2 for the following numbers answering NO: 2, 6, 7, 8, 9, 11, 14, 18. 
Score +1 for all numbers answering DON'T KNOW. 
Score 0 for all other possibilities. 
BOOK REVIEWS 


CARR, J. (1980). Helping Your Handicapped Child. Harmondsworth: Penguin 
Books, pp. 271, p. £1.95. 


SHEARER, A. (1980). Handicapped Children in Residential Care. A Study of Policy 
Failure. London: Bedford Square Press/NCVO, pp. 114, p. £4.95. 


Helping Your Handicapped Child is written primarily for parents of mentally handicapped 
children who wish to be actively involved in the education of their children. The book has 
two main parts. Part 1 deals with the methodology of behaviour modification; observation 
techniques; reinforcement; prompting, imitation; a consideration of some techniques which 
reduce inappropriate behaviour and recording methods. Part 2 gives practical examples of 
ways in which these methods can be applied and includes chapters on feeding, washing, 
dressing, toileting, language, play and ways of overcoming phobias and obsessions. A final 
chapter suggests ways of maintaining programmes and deals with difficulties that might arise. 

So far the book may seem to resemble other behaviour modification primers, designed 
for those who have day to day contact with handicapped children, but this is entirely mis- 
leading. The great strength of this book lies in Dr. Carr's writing style and in her attention to 
details of presentation. The text is clear, largely devoid of jargon and frequently humorous. 
Important principles and aspects of practice are illustrated pictorially, graphically and anec- 
dotally in an attractive and convincing manner. Above all, there is no hint of the condescen- 
sion that characterises some books for parents. Liberal use of major and sub-headings plus 
summaries of chapter content assist its use as a reference book, not only for parents but also 
for the psychologist who, like the reviewer, is sometimes at a loss as to how to communicate 
concepts such as negative reinforcement to parents. Frequent page references to preceding 
and subsequent parts of the book do have the merit of ensuring that parents miss no content 
which may have a bearing on their child's problem but occasionally leave the reader with a 
feeling of dislocation in an otherwise eminently readable book. Each chapter ends with 
‘practical problems’ which conform to reinforcement principles in that the reader is bound 
to achieve at least partial success. 

Those who have been involved in teaching the parents of handicapped children will 
know how difficult it is to convince even enthusiastic parents of the value of collecting baseline 
data and consistently recording progress. The emphasis placed upon quantifying observations 
and records in the first, middle and last chapter is consequently justified. It is to be 
hoped, however, that parents who are totally unfamiliar with such designs as ‘reversal’ and 
‘multiple baseline’, for example, are not discouraged from reading on. 

Sections dealing with such emotive techniques as flooding, over-correction and restraint 
are bound to attract criticism from some psychologists on the grounds that they may be 
dangerously over-simplified. However, they are not given undue emphasis, nor are they 
presented without reservation and the importance of consulting a psychologist if attempts fail 
is made clear. To omit mention of these approaches would render the book incomplete. 

Are there any omissions? In my view, more attention could have been paid to task 
selection, particularly for parents of the very young or profoundly handicapped child. By 
adding a few pages to the book it would have been possible to include a simple form of 
prescriptive assessment for parents to use in identifying such tasks. 

In summary, Dr. Carr has written an informed and persuasive book which recognises 
that parents are far more able to take positive action to assist their children's development 
than some psychologists have previously believed. Its publication by a house with wide- 
spread marketing facilities should ensure that it reaches them. 

While Dr. Carr's book makes a positive contribution to the educational needs of men- 
tally handicapped children, Mr Shearer's book draws attention to the failure of our health 
and social services to meet satisfactorily any of the other needs of handicapped children in 
residential care. 

Starting with the Curtis Report (1946), the reports of committees, evidence presented to 
Royal Commissions and legislation concerned with children living away from home are 
selectively considered, with particular emphasis on the way in which official action seems to suffer 
from a delayed response of about thirty years in comparison to action taken on behalf of 
non-handicapped children. It is uncomfortable to read the Curtis Report's description of 
hospital accommodation consisting of ‘decayed wood huts which had been condemned 
before the war’ (p. 11) and disheartening to then encounter the Development Team's 
Report in 1978 of ‘most appalling buildings and cramped conditions without even adequate 
facilities for washing, bathing, or the storage of clothing’ (p. 79). 

Evidence indicating the poverty of affection and personal interest, the lack of consistency 
and the dearth of opportunities for play and development progress recurs. However, the 
book is far from being simply accusatory. A final chapter entitled ‘ What Next?’ indicates 
clear priorities for legal and administrative action in the future. The need for relocating 
mentally and physically handicapped children and the importance of adoption, fostering and 
the provision of ordinary and specialist children’s homes in the community are argued 
convincingly. There is also the inevitable exhortation for funds to be made available for 
these purposes. Surprisingly, considering the nature of the book’s content, it is neither tedious 
to read nor are the arguments emotional. The presentation is balanced and reasoned with 
an understandable element of impatience intruding occasionally. 

It is unfortunate that the recent ‘shelving’ of the Jay Committee Report renders it most 
improbable that change will occur quickly but this should not detract from the book’s 
usefulness to psychologists and students. When read alongside a more detailed and compre- 
hensive set of recommendations such as those contained in Mittler’s recent book it gives a 
historical perspective which will be valuable to those trying to influence social policy. 


ТАМ BERRY 


REFERENCE 
MITTLER, P. (1979). People not Patients. London: Methuen. 


KNAPPER, C. K. (1980). Evaluating Instructional Technology. London: Croom Helm, 
pp. ix+163, c. £11.50. 


This is the fifth volume in the New Patterns of Learning series, edited by Philip Hills of 
Leicester University. The series is aimed at ‘all educators, trainers and administrators in 
higher, further and continuing education’, and is intended to ‘provide readable introductions 
to trends and areas of current thinking in education’. 

The scope of the volume under review is greater than the title might suggest and much of 
the book is devoted to a survey of the problems encountered in trying to evaluate teaching 
and learning in all phases of education from the single class or lecture to the whole course or 
programme. Instructional technology is broadly defined as ‘ the systematic design and imple- 
mentation of various technological devices that may supplant or supplement the human 
instructor’. Aids are considered only in so far as they form an integral part of the teaching/ 
learning system, and the author is not primarily concerned with educational technology, 
which is covered by an earlier book in the series. 

Dr. Knapper is Teaching Resource Person and Professor of Environmental Studies at the 
University of Waterloo, Ontario, Canada. His approach is refreshingly comprehensive 
and at the end of the book he provides an excellent annotated bibliography of some forty 
references, covering studies from his own country, from the USA, from the UK and from 
UNESCO reports. Apart from Bloom’s classic Taxonomy of Educational Objectives (1956), 
all these were published during the last decade. 

The first part of the book deals with the changes in education brought about by the 
introduction of programmed learning and teaching machines in the 1950s and 1960s, and by 
the stress on formulating educational aims and objectives. Reference is made to some of the 
pioneers in these fields, such as Pressey, Skinner, Keller, Postlethwait, Bloom and Gagné, 
while the organisation and delivery of instructional technology is considered in terms of 
enthusiastic individual teachers, media centres, and such institutions as the Open University. 
Criteria for evaluation are discussed and various methods of assessing whether these criteria 
have been met (descriptive, experimental, correlational) are critically analysed. 

The generalities and reservations of the first four chapters (some two-thirds of the book) 
no doubt provide a necessary survey and introduction for those unfamiliar with this field, but 
the more knowledgeable reader may well be glad to arrive at a whole chapter devoted to a 
detailed and critical study of some concrete examples. Four specific cases are considered: 
a course on particle mechanics taught by the Keller Plan at the University of Surrey; a 
comparison between the lecture and the computer at the University of Waterloo; the influence 
of teachers’ expectations on programmed learning at an English secondary school; and a 
large American survey of new teaching methods in eighty schools. 

In the last two chapters instructional technology itself is evaluated and some lessons for 
the future are offered. One can but agree that there is a need to optimise individual learning, 
and to discover the circumstances in which particular technologies can be most effective in 
promoting learning. Also, in view of the increase in unemployed, it is obviously important 
to extend evaluation to achievements in later life. If this book helps to bring about these 
laudable aims then it will indeed have served its purpose. If anything, this reviewer was 
rather disappointed that the author was not more radical in his views, but perhaps this was 
due to reading too much into the author's claim that his approach was somewhat idiosyn- 
cratic. 

One other disappointment was with the presentation of the book. It is a slim volume 
and the price is very high compared with other similar publications, hence one would have 
expected perfection in the layout and printing. A number of deficiencies make the book fall 
short of this expectation. For instance, the complicated learning matrix proposed by Rockart 
and Scott Morton is almost unreadably squashed onto one page, although Bloom's types of 
learning occupy two; the beginning of a paragraph is missing on page 121; and a letter has 
been omitted from ‘ Open University’ on page 139. Small points, perhaps, but Knapper 
includes cost-benefit among his criteria of evaluation and publishers should keep in mind the 
competing demands on the educator's limited financial resources. 

G. W. H. LEYTHAM 


RADFORD, J. and GOVIER, E. (Eds.) (1980). A Textbook of Psychology. London: 
Sheldon Press, pp. xi+766, p. £8.50, c. £25.00. 


The stable is North-East London Polytechnic; the race tracks are where candidates 
aspire to GCE A level in Psychology; the attraction for the punters is ‘the first British 
text book in the field for ten years’, and to parody Macaulay the standpoint is androcentric. 
What of the book? The principal editor requires psychology to be objective, empirical, 
eclectic and humane—in the British tradition of Galton, Bartlett, Burt and Broadbent. Yes, 
Burt. Despite the fabrication (acknowledged), he did make a useful contribution so a bonus 
point to Radford, for judgment in saying so. 

The first section gives a potted history of the schools by reference to their proponents, 
almost like a single volume encyclopedia of world knowledge, and bears tribute to the effects 
of our appalling examination systems. However, the statistical chapters are better as the 
techniques are related to the purpose and parametric and non-parametric data distinguished, 
something often lacking in beginners' textbooks. 

The second section grouping of psychology as a natural science is the usual physiology, 
basic neurology and their relationship to conditioning and behaviour, and the third section 
with a new style name of ‘Information Processing’ has the old style collection of perception, 
attention, memory and intelligence with an additional chapter on manipulating information. 
The trouble with a textbook is that you must meet the whims of the examiner. Otherwise 
why are psycho-physical methods mentioned? The word ‘manipulation’ in the last 
chapter is not pejorative but the title refers to language, concepts, thinking and problem- 
solving. 

Section four is a continuation of the previous one on the individual but this time with a 
developmental slant. Even if we must have Plato, Locke and Rousseau, which is doubtful, 
can Stanley Hall be justified, especially when Godfrey Thomson is missing? A concentration 
on fewer names with more exposition would have been preferable. 

The last section on social aspects is too compressed but as attitudes are the major issue 
at least the emphasis is correct. 
A bibliography, a set of statistical tables and an index complete the book. Some of the 
references would be extremely hard to find, the statistical tables seem unnecessary and the 
index is good. 

As the editors are now aware, the production of a general textbook is no easy task. We 
all tend to favour the one we ourselves were weaned on, but why are British, i.e. English, 
textbooks on psychology so crushingly dull? 

. J. MORRIS 


SUTTON-SMITH, B. (Ed.) (1980). Play and Learning. Chichester: Wiley, pp. vii+335, 
c. £14.60. 


This volume consists of papers presented at a symposium on Play and Learning held in 
1978 and sponsored by Johnson & Johnson Baby Products. The editor courageously takes it 
upon himself to play a full role, supplying us with a foreword, numerous linking commen- 
taries, and an epilogue. The individual papers themselves are relatively brief but the dis- 
cussions that follow appear to be presented in fair detail. Studies of infants and young 
children naturally feature most prominently, although there is an interesting chapter on adult 
play and the phenomenological concept of flow by Mihaly Csikszentmihalyi, which excited 
the imagination of the participants. Other topics include: Play as Arousal Modulation, 
Stages in Play Development between Birth and Two Years, Exploration and Play, Social 
Play, Individual differences in Styles of Play during Infancy, Play and Speech, and Anthro- 
pological Perspectives on Play. 

It is a book which I think would be found most rewarding by those with some familiarity 
with the research area. The beginning student would not be helped by the succinctness of 
the papers, which often contain tightly packed ideas needing considerable expansion. The 
discussions after each paper might have unpacked theoretical issues satisfactorily for the 
participants but could be merely confusing to the uninitiated, and these discussions, in fact, form 
the bulk of the book. The underlying problem is, of course, the existence of so many different 
perspectives on the subject, each with its own language. Among the concepts of play the 
editor kindly abstracts for the reader from the contributors are: play as arousal modulation, 
manipulation of end-means behaviour, non-prototypic variability, correction and resolution 
of uncertainty, self-generative processing, the manipulation of alternative frames, flow, 
envisagement of possible realms. All these are set before us as wares in an Arab bazaar. 

Thus, if the symposium has one theme, it is that of the apparent impossibility of defining 
play. Lively discussions circulate around this problem and present us with brief glimpses of 
possible solutions. Those psychologists of play who are ready to throw their established 
theories into the melting pot might conceivably profit from this account. 

NEIL BOLTON 
PARENTAL INVOLVEMENT IN LANGUAGE 
DEVELOPMENT: AN EVALUATION OF A SCHOOL- 
BASED PARENTAL ASSISTANCE PLAN 


By M. BEVERIDGE AND ANN JERRAMS 
(University of Manchester) 


SUMMARY. Four groups of 10 nursery school children were selected such that each 
child was matched on EPVT, Reynell Language Development Scales and Raven’s 
Progressive Matrices with one member of each of the other groups. A Parental Assistance 
Plan (PAP), designed to help parents to facilitate their children’s language development, 
was devised. This plan involved the parents in working on language with their 
children at home for 20 minutes per day, and familiarisation with the plan brought the 
parents into school for one half day per week for 12 weeks. The mothers of two of the 
groups received the PAP and worked with their children accordingly. One of these 
two groups also received the Distar Language programme. A third group received 
only the Distar and the fourth group received no formal intervention but merely played 
with toys in the presence of a teacher for an equivalent period. Results of an immediate 
post-test and another 18 months later showed that both the groups whose parents 
participated in the PAP showed significantly greater increase in language development. 


INTRODUCTION 


IT is now more than a decade since the Plowden Report argued that “ nursery 
education will only succeed to the full if it carries the parents into partnership ” 


While no one has publicly disagreed with this aim, little has been done to describe 
exactly what it would mean in terms of school practice. Proposals with regard to 
improving parent-teacher relationships do appear in the Plowden Report, but these 
proposals are only likely to be effective with middle class parents who are " tuned in 
to the same cultural wavelength ” as teachers (Davie et al., 1972). 


However, as Haigh (1977) states, to assume working class parents are not 
interested in their children's education is one of the most damaging false assumptions 
which teachers and educationists can make. What is true, Haigh claims, is that these 
parents find difficulty in communicating their concern; thus the problem is seen as 
one of communication but not indifference. 


This view is also supported by Tizard (1977) who, in a survey on parental 
involvement in nursery education, referred to a number of parents who stated that 
they would like to help their children with their education, but did not do so because 
they did not know how to, or because they were afraid of interfering. Almost all the 
parents interviewed in the Tizard survey had older children in the primary school, 
and it seemed that the experience of school had tended to undermine the parents' 
confidence in their ability to help their children. 


These parents were not critical of the nurseries in any way and appeared 
appreciative of the teachers’ efforts. However, they tended to think of the nursery 
as a socialising experience for the child and little else: it would appear that the 
educational implications of nursery education were unforeseen. 


The teachers, on the other hand, especially in Social Priority (SP) areas, saw 
their function as remedying the deficits of the home; as Bernstein (1971) argues, it is 
this wedge progressively driven between the child as a member of the school and the 
child as a member of a family and community that must be broken down. The 
child and his parents are too often expected to drop their ‘ social identity’ at the 
school gate because their culture is seen as ‘ deprived ' by the teachers. 
From the Tizard survey it appears that many of these teachers are critical of the 
parents for not providing the necessary stimulus at home, namely, stimulating 
conversation, books and suitable toys. One teacher comment in the report states: 


“If only the parents could understand what we do at school, then it could be 
‘nursery ’ for the child all day.” 


Yet, unless specifically asked by the parents, none of the teachers made any 
concrete suggestions to the parents as to how they could help their children at home. 
And when invitations were extended to the parents to visit the nursery very few 
parents responded. However, the survey discovered that most parents declined the 
invitations because they felt ‘ill at ease’ in the classroom, uncertain of what they 
were meant to do and totally unaware of the educational implications which the 
teachers had hoped would be * obvious’. 


It would appear that the resentment which many teachers feel for these parents 
stems from ignorance on both sides. The teachers think the parents don't care, 
whilst the parents remain uninvolved because of a total unawareness of what the 
teachers require from them. And this gulf of ignorance will remain unless attempts 
are made to overcome the misunderstandings of both teacher and parent. As Hymes 
(1974) states: 


“We have to end the separation of home and school—too much is at stake to 
let the foolish lack of communication persist—the left hand must know what the 
right is doing, for nowhere in the long educational continuum is the parent-teacher 
relationship more important than in the child's early years.” 


How, then, if at all, can parents in a Social Priority Area be helped to develop 
their children's language and learning skills? The paucity of research examining 
this important issue is perhaps due to practical difficulties. Woodhead (1976) writes: 


“The problems facing any scheme to use the school as a base for parent 
education—namely how to design such an operation which is within the resources of 
the school and staffing, and how to encourage these particular parents to 
participate, is problematic. It is these dilemmas, never mind the problem of how to 
evaluate such a scheme, which seem so effectively to have discouraged researchers 
from taking on the task.” 


The research reported in the present paper addresses some of these issues by 
means of a ‘Parental Assistance Plan’. This was designed to give the parents an 
awareness of their personal contribution in assisting in their child's language 
development, and to create parent-teacher dialogue in a classroom setting. 

In compiling the ‘Parental Assistance Plan’ attention was given to developing 
the basic language teaching points in a sufficiently interesting way to maintain 
attendance and parental interest over a period of 12 consecutive weeks. The plan 
incorporated discussions, a film, demonstrations and practical experience. 


METHOD 


The Parental Assistance Plan (Based on the Renfrewshire Pre-school Language 
Programme) 


The plan was arranged in two parts in each weekly session: 


Part 1—talking to the parents and demonstrating teaching points; 


Part 2—a practical demonstration with a small group of children to reinforce the 
teaching points in Part 1. 


This procedure appeared to be satisfactory to the majority of parents as indicated 
by the following comments taken from recorded interviews on completion of the 
project: 


“I wouldn't want a full hour talking—talking wasn't the same, when the children 
came in you could see what to do.” 


“I liked the split up where children came in to demonstrate, 'cos that gave you 
the idea how a teacher does it—it looks easy but it's not.” 


The 12 weekly sessions proceeded, using this format, until the final programme 
when the parents were taken to visit a display of toys during which they were asked 
to make specific observations. The sessions are outlined in Appendix I. 


The evaluation of the PAP 


School context 

The aim of this research was to evaluate the use of the PAP with parents living 
in a Social Priority Area. The first requirement was a school in such an area which 
would provide the necessary facilities for carrying out a research project of this kind, 
and in particular a staff who would co-operate. A Manchester city infant nursery 
school was located which agreed to participate and offered in principle the full 
co-operation of all teachers to be involved. 


The school itself was a traditional building built in the early 1930s and had, at 
the time of the research project, 388 pupils. This number also included part-time 
nursery children. The staff numbered one headteacher, nine infant teachers, three 
nursery teachers and a number of NNEB trained ancillary aides. 


The philosophy and climate of the school was one where literacy took a high 
priority, and a ‘reading hour’ was set aside each day when children of differing 
abilities, regardless of age, joined a specific teacher for any special requirements for 
reading attainment. Particular emphasis was laid on the teaching of phonics, and an 
eclectic approach was used in the teaching of reading. These were combined with a 
‘thematic’ approach to learning, ‘to aid the child's conceptual development’. 


The school's catchment area is mainly inhabited by unskilled working class 
families of the Registrar General's Social Class V. The surrounding property is a 
vast sprawling estate built also in the early 1930s and was one of Manchester's first 
housing complexes. As a Social Priority Area, it has many of the problems which 
demand considerable attention by the social services. This area was recently quoted 
as having the highest crime rate throughout the whole of Europe, and offenders are 
said to be becoming younger. Psychiatric illness is relatively common, and many 
mothers were treated for depression and neurotic symptoms. Some of the mothers 
had made suicide attempts. Yet those parents who participated in the project, and 
many others who wished to become involved, showed a real concern for their children 
and wished a better future for them. 


Procedure 

The evaluation of the PAP took the form of measuring its effect on children's 
language as compared with the effects of a recognised language intervention programme 
not involving parents. This latter programme was ‘Distar’ (Englemann and Osborn, 
1969). 


Sampling 

Forty children, comprising four groups of 10, participated in the research; 
group 1 received the Distar programme only; group 2 received both Distar and the 
PAP; group 3 received the PAP only and group 4 did not receive any planned 
programme but merely played with toys for 20 minutes in the company of a teacher. 


The four groups were formed from 20 boys and 20 girls, ranging in age from 3 
years 5 months to 4 years 5 months. The children had been assessed on Raven’s 
Progressive Matrices, Reynell Comprehension and Expressive Language Scales and 
the English Picture Vocabulary Test (EPVT). These 40 children were selected so that 
each group member was closely matched on these measures to one member of each of 
the other groups. The four groups were thus made up of 10 matched quartets. There 
were no significant differences (t-test) between any pair of groups in the initial 
assessments. 


Programme allocation 

The two groups receiving Distar were each divided for programming into two 
sub-groups of five children as recommended in the manual. The five less advanced 
children in each group were all placed in the same sub-group. Each day the teachers 
programmed two sub-groups in the morning and the others in the afternoon. But as 
young children become more tired in the afternoon, and are therefore less likely to 
profit from instruction, the teachers alternated weekly the time of day for programming 
each sub-group. The Distar programming continued on each school day for 12 
weeks. 


The 20 parents whose children constituted Groups 2 and 3 were divided into two 
groups of 10, and assigned according to choice, to one of two afternoons each week, 
for participation in the PAP. These 12 weekly sessions of one hour, as outlined earlier, 
were designed to help the parents promote the linguistic skills of their children. 
Following these sessions the parents were asked to work at home daily for 20 minutes 
on the theme of the items covered in the PAP. Comments from some parents 
revealed that more than 20 minutes was spent on helping their children. And the 
discussion with the parents indicated that they did all spend at least 20 minutes daily 
working with their children on topics related to the PAP (see Table 1 for summary of 
the allocation of intervention type to each group). 


Post-testing 

All 40 children were post-tested on the Reynell Scales and on the EPVT, once 
immediately after the intervention period and then again 18 months later. These 
post-tests were carried out by two experienced teachers who were unaware of the 
programme received by each child. Unfortunately the post-tests with the Reynell 
Scales showed several of the children in each group to have reached the ceiling scores 
for this test. Hence no conclusive inter-group comparisons could be made using this 
measure. However, the indications from the Reynell post-tests were that they 
supported the results of the EPVT. 


Attitudes of teachers and parents 

Following the 12 sessions, the parents were asked to complete questionnaires 
devised to elicit their views of the PAP. Individual recorded interviews then took 
place when each parent was asked to expand on four specific questions on the 
questionnaire form. These comments were transcribed. Gains made by children in 
the post-tests were given to the parents during the interview. 


The three nursery teachers involved in this project and the headteacher were 
asked to complete a questionnaire. On completion of the questionnaire each teacher, 
including the headteacher, attended a recorded interview where each one was asked to 
expand where necessary on any particular question. A transcription of relevant 
comments was made of each recording. 

Finally all nursery teachers submitted ‘comment books’ which they had been 
asked to keep throughout the research project, and the headteacher wrote a brief 
report on the effects of the research in general. 


Teachers’ comments on the PAP 

The outcome of the questionnaires and interviews with the teachers indicated 
enthusiasm for, and approval of, the PAP. This is illustrated by the following 
verbatim comments made by the teachers. 


Teacher A 

“The mums were able to talk to us—they were very interested in what you were 
doing with them and the books you'd given them—they were happy—quite proud, I 
think, that they were doing this work with you. I find I have gained in several ways 
from participating in this research project. The research has stimulated some parents 
to ask about the activities we involve the children in—and has therefore provided a 
link between parent/teacher on which I can now build.” 


Teacher B 

“I would like to see parental programming continuing. A scheme of this sort 
started in the nursery where there is a more flexible routine could help the parent's 
attitude towards the school as a whole. In these days particularly, we find that the 
school is expected to train the children in all aspects of development, with parents 
opting out more and more; perhaps it is here in the nursery that the parents can be 
shown their responsibility to their children. Parents are not reading to their children, 
not teaching them nursery rhymes, and seem to be unaware how damaging this is to 
the children; therefore it would be better if they could be shown the value of this 
learning by listening to the teachers and participating. I think they would begin to 
extend more aspects of language and conversation in the home and ensure that the 
children gain more from practical experiences at home.” 


As these comments indicate, the teachers were in support of the PAP after the 
intervention had taken place. This view was to some degree in contrast to an initial 
scepticism about whether the parents would continue to participate. However, as 
the next section shows, the parents developed a positive attitude to the PAP. 


Parents’ comments on the PAP 

The continued participation of the parents in the PAP is perhaps the most 
important indication of their approval. However, the following parental comments 
give some clues as to why they took to the PAP so readily. 


(Mrs Ha) “It became a regular pattern coming to school—I've learnt a lot—my 
little girl is on Book 3 (Ladybird) and a parent whose child is still on the 
first book said—it's because you went to them classes and learnt them 
things.” 

(Mrs P) “I’ve learnt so many things I didn't know before—those shapes—I use 
them myself—I say to my husband—‘ do you know what a hexagon is?’ 
and he doesn’t know. I’ve learnt a lot and it’s done me good ’cos I wasn’t 
so good at school, I was behind—but now I feel I don’t have to ask my 
husband so many things. I never knew what prepositions was.” 


(Mrs Par) “It’s got me to the stage where I can tell stories now and they'll sit and 
listen—before I couldn't tell stories for peanuts—I only knew one before— 
‘Rosie's Walk’ and that was only ’cos it was learnt me.” 


(Mrs S) “I used to relay everything back to my husband—he did nearly everything 
on those Handouts—he ticked them off as he did them.” (The husband 
carried out the programme as the wife worked evening shifts.) This was 
the one member who claimed in her questionnaire that the programmes 
had helped ‘very little’. 

(Mrs C) “I think all parents should have the same chance as the group had when 
their children are about two and a half years old. The programme is very 
useful, I'm only sorry that I did not have the know-how, when my two 
children were younger. Starting younger would have been better, sort of 
making a mould so that when they go to school, it's not such a strain on 
them, and they would want to learn more. Other parents are interested. 
I took my programme units to work each week after a meeting, and I 
explained to some of the girls what we had been doing. They would go 
home and try some of the things with their kids. They would say they 
loved the story or they would say he's thick, he doesn't even know what 
above is, or under. (Like I did.) When I explained to them that at first 
I thought this, but how can they know if we don't teach them, you just 
take it for granted that when they get to school they will learn everything 
at once.” 


(Mrs Par) “I think the programme was a great value to both parents and child 
because it gets you to talk with your child and also at storytime you and 
your child can get in the story which you are telling at the time. The 
finger-play and rhyme give parents the chance to act a little childish and 
this makes the child think you are his/her friend which as his mother 
brings you both closer together. Other parents should find them useful 
but it all depends on the parent.” 


The above comments indicate that the parents were pleased to be involved in the 
project, and, in general, felt good about being able to help their children. There is 
always a chance that the parents are merely saying what they think is required of 
them; however, all our indications from informal contact with the school are that 
the above comments are genuinely motivated. 


RESULTS 


Table 1 gives the results of the pre-test in June 1977 and the results of both the 
post-tests, the first in January, 1978, and the second in June, 1979. In presenting 
these results 20 points were added to the scores on Test I of the EPVT, a procedure 
which converts Test I scores into a scale which can be directly compared with scores 
on the pre-school version. This procedure does marginally underestimate any gains 
made by all the groups, but it is most likely to depress the scores of the groups showing 
greatest improvement and will not therefore increase the significance of any differences 
between the group improvements in performance. 


The design of the study is suited to the construction of an ANOVA table with 
three sets of variables. These were the four groups, the two post-tests and the nine 
quartets which feature in both post-tests (one child from each group was unavailable 
for the second post-test). Table 2 gives the results of this ANOVA. From Table 2 
two relevant F ratios can be calculated. Comparison of the variance due to the groups 
and the groups and quartets combined gives an F ratio of 4.455 with 3 and 24 degrees 
of freedom (P<0.025). This demonstrates a significant intervention effect which 
persists over time. 


In order to find the source of the intervention effect, inter-group comparisons 
were made using a t-test. This procedure revealed just two significant differences 
between groups; group 2 was significantly different from group 4 (t = 2.136, df 24, 
P<0.05) and group 3 was also significantly different from group 4 (t = 2.26, df 24, 
P<0.05), with group 4 doing less well in both cases. The significant intervention 
effect is therefore due to the improvements made by the two groups involving the 
PAP compared with the control group. 


TABLE 1 

MEANS AND STANDARD DEVIATIONS OF THE EPVT SCORE FOR EACH GROUP ON 
THE PRE-TEST AND THE TWO POST-TESTS FOR THE INTERVENTION CONDITIONS 

                               June 1977     Jan. 1978     June 1979 
                               Pre-test      Post-test     2nd Post-test 
                               EPVT Score    EPVT Score    EPVT Score 
Group 1 
  Distar only          Mean    21.2          27.8          37.5 
                       SD       8.65          9.90          6.75 
Group 2 
  Distar + PAP         Mean    20.8          31.8          38.2 
                       SD       8.75          8.96          6.28 
Group 3 
  PAP only             Mean    21.0          30.2          41.0 
                       SD       9.10          9.55          4.24 
Group 4 
  Control              Mean    21.3          25.8          35.8 
  (Table toys)         SD       7.77          7.92          5.78 


TABLE 2 

THE ANOVA OF EPVT SCORE FOR THE FOUR GROUPS, THE TWO POST-TESTS 
AND THE NINE QUARTETS 

                     Sum of              Variance 
Source               Squares      df     Estimate 
V 1 Groups            273.708      3       91.236 
V 2 Occasions        1672.347      1     1672.347 
V 3 Quartets         2840.028      8      355.003 
V 1 and 2              65.153      3       21.718 
V 1 and 3             491.417     24       20.476 
V 2 and 3             417.528      8       52.191 
Residual              231.472     24        9.645 

Total                5991.653     71 


The F ratio derived by comparing the combined variance of the groups and tests 
with the residual variance shows that there is no differential growth across matched 
quartets over time as a function of intervention (F = 2.252, df 3, 24, NS). There is 
thus no systematic variation in the effects of the interventions as a function of the 
initial ability of the children. 
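
The two F ratios quoted from Table 2 can be checked by simple division. The Python sketch below is an added illustration rather than part of the original analysis: the dictionary keys and function names are hypothetical, and the groups-by-occasions term is assumed to carry 3 degrees of freedom, which is what the quoted "df 3, 24" and the total of 71 imply. No significance levels are computed, so no statistics library is needed.

```python
# Sums of squares and degrees of freedom taken from Table 2.
# The keys are illustrative labels, not names used in the paper.
table2 = {
    "groups": (273.708, 3),
    "occasions": (1672.347, 1),
    "quartets": (2840.028, 8),
    "groups_x_occasions": (65.153, 3),   # df assumed to be 3 (see lead-in)
    "groups_x_quartets": (491.417, 24),
    "occasions_x_quartets": (417.528, 8),
    "residual": (231.472, 24),
}


def variance_estimate(source):
    """Sum of squares divided by its degrees of freedom."""
    ss, df = table2[source]
    return ss / df


def f_ratio(effect, error):
    """Ratio of two variance estimates."""
    return variance_estimate(effect) / variance_estimate(error)


# Intervention effect: groups tested against groups-by-quartets.
f_intervention = f_ratio("groups", "groups_x_quartets")

# Differential growth: groups-by-occasions tested against the residual.
f_growth = f_ratio("groups_x_occasions", "residual")

# Consistency check on the table itself.
total_ss = sum(ss for ss, _ in table2.values())
total_df = sum(df for _, df in table2.values())

print(f"intervention F = {f_intervention:.3f}")
print(f"growth F       = {f_growth:.3f}")
print(f"total SS = {total_ss:.3f}, total df = {total_df}")
```

The component sums of squares and degrees of freedom add up to the printed totals, which is a useful check that the table has been read correctly.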


The F ratio of the ANCOVA shown in Table 3 (F = 0.635, df 3, 28, NS) tests the 
significance of the difference in the slope of the lines joining the first and second 
post-tests (see Figure 1). As this F ratio is considerably below 1.0, this result 
indicates that any apparent differences in post-treatment growth rates are attributable 
to chance. The within-group regression of first post-test score on second post-test 
score of Table 3 is 0.434 with variance of 0.0109. This regression will 95 per cent 
of the time fall between 0.226 and 0.642. These fairly tight limits indicate that the 
failure to find a significant difference in post-intervention growth rates cannot simply 
be attributed to lack of power in the comparison due to small within-sample numbers. 
Hence it is justifiable to claim that the intervention results in a new starting point from 
which development proceeds at the normal rate, rather than that it creates a short-term 
acceleration. 
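
The slope comparison can be retraced from the sums of squares and products given in Table 3. The Python sketch below is an added illustration under stated assumptions: the three numeric columns of Table 3 are read as Σx², Σxy and Σy² for each group, the 95 per cent limits are taken as the common slope plus or minus two standard errors using the variance quoted in the text, and all variable and function names are hypothetical. Because the table entries are rounded to whole numbers, the results agree with the printed figures only approximately.

```python
import math

# (sum of x^2, sum of xy, sum of y^2) for groups 1 to 4, as read from Table 3.
groups = [
    (889, 365, 365),
    (722, 417, 316),
    (771, 224, 144),
    (733, 346, 264),
]
n_per_group = 9  # nine quartets took both post-tests


def residual_ss(sxx, sxy, syy):
    """Sum of squares remaining after fitting a straight line."""
    return syy - sxy ** 2 / sxx


# Each group fitted with its own slope; residuals pooled across groups.
within_ss = sum(residual_ss(*g) for g in groups)
within_df = len(groups) * (n_per_group - 2)      # 4 x 7 = 28
within_ms = within_ss / within_df

# All groups fitted with one common slope.
sxx, sxy, syy = (sum(column) for column in zip(*groups))
common_slope = sxy / sxx
common_ss = residual_ss(sxx, sxy, syy)

# Extra residual incurred by forcing the four slopes to be equal.
between_ss = common_ss - within_ss
between_df = len(groups) - 1                     # 3
f_slopes = (between_ss / between_df) / within_ms

# Limits for the common slope, using the variance quoted in the text.
slope_variance = 0.0109
half_width = 2 * math.sqrt(slope_variance)
limits = (common_slope - half_width, common_slope + half_width)

print(f"common slope   = {common_slope:.3f}")
print(f"within mean sq = {within_ms:.2f}")
print(f"F for slopes   = {f_slopes:.2f}")
print(f"limits         = {limits[0]:.3f} to {limits[1]:.3f}")
```

An F ratio below 1.0 means the four separate slopes fit the data no better than a single common slope would be expected to by chance, which is the basis for the claim that the groups grew at the same rate after the intervention.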


FIGURE 1 
THE MEAN SCORES OF THE FOUR GROUPS ON THE FIRST AND SECOND POST-TESTS ON EPVT 

[Figure not reproduced. Vertical axis: EPVT score.] 


TABLE 3 

THE ANCOVA OF THE FOUR GROUPS’ PERFORMANCE ON FIRST AND 
SECOND POST-TESTS ON EPVT 

Source         df      Σx²      Σxy      Σy²      df    Mean Sq. 
Within  1       8      889      365      365       7 
        2       8      722      417      316       7 
        3       8      771      224      144       7 
        4       8      733      346      264       7 
                                                  28     16.79 
Total          32     3115     1352     1089      31     — 
Between                                            3     10.67 

DISCUSSION 


The results of this study indicate that both interventions which involved the 
Parental Assistance Plan were effective in improving the language skills of the 
children. The intervention which used only Distar also improved the children's 
performance but this improvement was not significant over the two post-test scores. 


The effects of the PAP may well be due to the direct teaching given to the children 
by their parents. However, it should be remembered that a minimal intervention 
plan can also produce significant and positive changes in parents' language to their 
children (Cheseldine and McConkey, 1979). And if the language environment of the 
children has been affected in this way then some of the gains made by the children 
might be attributed to this change. The children’s initial ability was not related to 
the gains made by the intervention. This result is encouraging for the generalisability 
of the PAP over ability levels, and it also suggests that intervention should not be 
restricted to children at one ability level. 


The questionnaire and interviews with the parents revealed that they found the 
PAP interesting and enjoyable, a result which supports the views of Haigh quoted 
earlier. The teachers' attitude to the PAP was similarly positive; however, they 
were distinctly negative in their comments on the Distar. The teachers indicated 
dislike of the formal mode of instruction which is characteristic of the Distar 
programme. It might be argued that the teachers’ attitude was responsible for the 
relatively poor performance of the Distar group. However, dislike of the Distar 
does militate against its successful use in the school system. If we are to provide 
intervention programmes which have any widespread effect, they must be popular 
with the teachers who will have to implement them. On the basis of our evidence 
Distar is unlikely to be used with any enthusiasm in schools, and given the choice 
teachers are unlikely to use it at all. 


The recent re-evaluation of the American Headstart programme has suggested 
some benefits of early intervention which last throughout school (Lazar et al., 1978). 
That programme also concludes that home-based intervention is no more beneficial 
than school-based intervention. However, the aim of the present study was to 
overcome the lack of dialogue between home and school, on the assumption that 
intervention which is totally home- or school-based is unlikely to gain maximum 
effect from the effort involved. In this respect the present study shares the aim of 
Donachy (1976), who also showed some significant gains as a result of an intervention 
project. Donachy suggests that an important outcome of his study may have been 
to make parents and teachers aware that their co-operation is a viable venture, and 
the persistence of the gains in the present study over a period of 18 months may well 
be due to continual parent-teacher dialogue. 


It remains therefore to conclude that parents in a Social Priority Area will 
involve themselves with schools in the promotion of their children's linguistic skills, 
provided that resources are made available. And, furthermore, the results of the 
study reported here suggest that the children benefit from this involvement. 


APPENDIX I 
Session 1 
Introductory talk; Explanation of weekly programmes; The aim of the project and film. 
Session 2 


Story-telling techniques; Types of stories; Demonstration of telling a story; Five rhymes 
demonstrated with five children present; Story book and book of rhymes distributed to each parent. 


Session 3 
Continuation of story-telling techniques; Plus rhymes and finger plays; Difficulties discussed 
from previous week's session; Further demonstration of finger plays and action songs. 


Session 4 
Extending story-time into discussion; Importance of communicating with the child; Simple 
explanation of language development; The importance of extending language through pictures; 
Demonstration of poems and rhymes on a given theme ‘Transport’; Handout: Story-telling 
techniques. 


Session 5 

Discussion on problems from previous week's work; Handout: Extension of language through 
story-telling, poems or rhymes; Demonstration of specific rhyming books; Parents attended 
‘Storytime’ in the nursery; Three teachers took three groups; Parents asked to look out for: 
(a) children who concentrated; (b) children who listen and understand by their questions; 
(c) disturbed children lacking in concentration; (d) general techniques of storytelling to larger 
groups of children. 

Parents did not attend a group which involved their own child. 


Session 6 
Discussion of problems; Number books to increase number concepts and language; Shapes; 
Puzzles; Teaching colours; Handout: Mother/child interaction (taken from Tough’s 
“Communication Skills in Early Childhood”); The necessity for explanations; Discussing contents of 
pictures; Naming of objects; Questions to promote thinking; Open, closed, enabling questions; 
Demonstration of above with three children with pictures. 


Session 7 
Discussion of problems; Demonstrating the importance of prepositions (using farmyard animals); 
Handout: Use of prepositions; Joining library; Introducing Ladybird “Talk About” books and 
0-5 years pamphlets “The Importance of Talk”. 


Session 8 
(1) Discussion: (a) Observations, successes and failures encountered during half-term holiday; 
(b) Use of handout during vacation; (c) Story: ability to extend discussion linked to a story; 
(d) Pictures: success and failure with regard to questioning (3 types: closed, open, enabling). 
(2) Introduction to new handout: (a) Classifications; (b) Categorisation; (c) Odd man out; 
(d) Similarities; (e) Opposites. 
(3) (a) Introduced book “Bears in the Night” to reinforce work on prepositions; (b) Recording 
of advanced 4-year-old talking (World Jigsaw); (c) Checked all parents have now joined library; 
(d) Recorded entire session—evidence of types of programming. 


Session 9 
(1) (a) Success and failures with regard to handout on classifications and categories; (b) Use of 
why—because—etc.; Necessity of explanations for every child. 


(2) Provided new storybooks and demonstrated a short story and nonsense rhymes; Pointed out 
[…] of speech and speech patterns, grammatical structures; Importance of above for verbal 
fluency. 
(3) Introduced educational toys: (a) Toys which teach: shape; colour; size; (b) Demonstrated 
each specific toy; (c) Allowed all parents to use and experiment with the toys. 
(4) The importance of toys in a child's play. 
(5) Date given out for visit to Galts—Christmas visit. 
Session 10 
(1) Discussion—problems, success, failures. 
(2) Introduced handout on: (a) Shapes; (b) Number language, e.g., more/less comparatives etc. 
(3) Introduced new number rhymes. 
(4) Practical session; All parents made different shapes and sizes with card. 
(5) Demonstration of how to use the cards and shapes. 
(6) Extended this into matching shapes. 


Session 11 
(1) Discussion—successes and failures with regard to shapes made in programme 10; Different 
ways in which parents used the shapes; Concentration of size and colour as well as shape. 
(2) Introduced handout: (a) Very simplified demonstration of Piaget's work on conservation of 
number; (b) Reversibility; (c) One to one correspondence; (d) Use of more than, less than, etc. 
Demonstration with two children: 1 child had not acquired conservation of number, 1 child 
had done so. 

Session 12 


Visit to Galts (parents taken in minibus). 
Observations: Types of educational toys; Durability; Price—value for money; Ages 
and stages—what the toy is teaching the child. 


THE USE OF ABSOLUTE AND RELATIVE CODES IN 
CHILDREN’S DISCRIMINATION LEARNING 


By SUSAN D. PRYCE AND A. M. SLATER 
(Department of Psychology, University of Exeter) 


SUMMARY. Four experiments are described in which children's and adults’ abilities to 
detect and respond to the absolute and relative features of stimuli varying in size were 
investigated. The first two experiments confirmed and extended previous findings and 
indicated that young children depend primarily on relative codes and have difficulties 
in registering the absolute values of stimuli. In Experiment 3 this finding was reversed 
and four-year-old children (a) performed best in a learning task in which both the 
absolute and relative components of the stimuli were held constant, and (b) learned 
more quickly in a task requiring abstraction of the absolute components of the stimuli 
than in one where relative cues only were available. Differences in the methodologies 
of the two designs were explored in Experiment 4 in order to resolve the apparently 
contradictory nature of these findings. Overall, the results of the four experiments 
lead to the conclusion that relational responding to a stimulus that is of intermediate 
size is more difficult than responding to the absolute properties of a stimulus, which in 
turn is more difficult than responding to a simple relational cue such as smallest or largest. 

The results of the experiments show that children from an early age are able to 
abstract and respond both to the relative and absolute properties of stimuli, and that 
the features responded to in a particular experiment are determined primarily, not by 
age, but by the characteristics of the experimental design and by the nature of the absolute 
or relative response required. 


INTRODUCTION 


WHEN faced with certain types of discrimination learning tasks children often exhibit 
a preference for coding the relationships between two stimuli rather than the absolute 
features of the stimuli (e.g., Kohler, 1920; Johnson and Zara, 1960; Graham et al., 
1964; Lawrenson and Bryant, 1972; Lawrenson, 1974). In a typical task subjects 
are trained to choose the larger (i.e., B) of two stimuli, A and B. When they have 
learned to do this successfully they are presented with stimulus B paired with, say, a 
larger stimulus, C. This presentation is called a transposition. If the subjects then 
choose the largest stimulus this is a relative choice in that it indicates that they have 
learned to respond to the relative sizes of the two training stimuli rather than their 
absolute size. If, however, subjects choose stimulus B, this is an absolute response in 
that they appear to be responding to the absolute properties, i.e., ‘sameness’, of the 
previously rewarded stimulus. Typically, young children will make a relative response 
when stimulus C is only one step removed from stimulus B (a near transposition); 
where it is removed more than one step along the size continuum from the training 
pair (a far transposition) 3- and 4-year-olds now respond randomly but 6- and 
7-year-olds continue with a relative choice. 


Early researchers have used the transposition test to support their theories 
about developmental trends in the use of absolute and relative codes. Such studies 
have been criticised on at least two grounds. First, the tests involve the presentation 
of a novel stimulus which may be disruptive and introduce a random element in 
responding. Second, they do not take into consideration the possible significance of 
the relationship between the stimuli and their background: if such a relationship is 
utilised in the learning stage then the stimuli introduced, especially in a far test, may 
disrupt subjects’ performance because the new stimuli will form a different relation- 
ship with the background, being several steps removed from the training pair. To 
avoid these difficulties of interpretation, Graham et al. (1964) and Lawrenson (1974) 
carried out experiments in which preference for absolute and relative codes was 
investigated without the introduction of the traditional transposition test. In their 
studies a comparison was made between the number of trials taken by subjects to 
learn tasks which could be solved either by the use of an absolute or by the use of a 
relative code. Their results showed that tasks which could be best solved by relative 
coding were learned more quickly and with fewer mistakes. There is some support 
for the notion that the stimulus-to-background ratio is coded during normal dis- 
crimination learning (e.g., Riley, 1958; Corbascio, 1964; Lawrenson and Bratus, 
1976). The most recent and comprehensive theory to incorporate ratio coding is that 
put forward by Reese (1968). 


Whether there is a developmental trend in children's use of relative and absolute 
codes is an interesting question which at present has not been fully resolved. Early 
researchers (e.g., Spence, 1942; Kuenne, 1946) suggested that there is a developmental 
trend such that absolute coding was initially preferred; relative responding became 
the dominant mode of response when the children acquired verbal mediational labels, 
such as ‘bigger’ and ‘smaller’. This experimental hypothesis has received little 
support. The available evidence suggests that, if anything, the reverse trend occurs; 
certainly as we have seen above, young children seem to prefer responding using a 
relative code. Lawrenson (1974) reported that 3- to 7-year-olds, though preferring 
the relative task, showed an improvement in performance on both tasks with age, thus 
there did not seem to be a qualitative shift in responding. Bryant (1974) has suggested 
that while both children and adults show a greater facility for relative codes than for 
absolute codes, adults do possess some absolute codes and young children do not. 
His theory begins with the premise “... that young children can on the whole register 
and remember relative values with great ease, but have problems in situations in 
which they must remember absolute values along any continuum”. He suggests that 
“young children have considerable problems in understanding the rules governing 
their environment simply because they cannot remember the absolute properties of 
objects around them, and that as they grow older they begin to develop some 
strategies for coping with them” (p. 14). 

The absolute/relative question remains of interest but no clear developmental 
trend is fully supported. The most likely possible developmental trends appear to be 
twofold: first, that children respond exclusively or primarily on the basis of relative 
cues up to the age of about five or six years and, thereafter, the use of absolute coding 
becomes possible; a second possibility is that the developmental progression is 
purely quantitative in that both types of code are available at all ages and that 
performance on both the relative and absolute task improves with age. A development 
of this hypothesis might be to say that children will use relative codes in some 
situations and absolute codes in others and that much of the existing confusion results 
from different researchers using different experimental situations. 

We describe here a series of experiments designed to clarify some of these issues. 
The first of these was an investigation into developmental trends in the use of absolute 
and relative codes. Experiments 2, 3 and 4 were carried out to explore further children's 
uses of relative and absolute codes. 


EXPERIMENT 1 
In this experiment 4-year-olds, 8-year-olds and adult subjects were required to 
solve a discrimination learning task. For half of the subjects the task could be solved 
by the use of relative coding and for half by the use of absolute coding. 


METHOD 
Subjects 
The subjects were 30 playgroup children (mean age 4 yrs 5 mths, age range 
3 yrs 8 mths to 4 yrs 10 mths), 32 primary school children (mean age 8 yrs 3 mths, 
range 7 yrs 10 mths to 8 yrs 8 mths), and 30 university students (mean age 19 yrs 
6 mths, range 18 yrs 1 mth to 25 yrs 3 mths). 


Materials 

The stimuli were eight white cardboard squares forming a series whose area 
ratio was 1 : 1.4 between adjacent members. The lengths of the individual card sides 
from smallest to largest were 1.41, 1.67, 1.97, 2.34, 2.77, 3.28, 3.88 and 4.59 inches. 
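
The construction of this series can be sketched in a few lines. This is only an illustration: the starting area of 2 square inches is an assumption (it is not stated in the text), chosen because it reproduces the reported side lengths to within about 0.01 inch; the function name is hypothetical.

```python
import math


def side_lengths(smallest_area, area_ratio, count):
    """Sides of squares whose areas form a geometric series."""
    return [math.sqrt(smallest_area * area_ratio ** k) for k in range(count)]


# Assumption: the smallest square has an area of 2 sq in (side about 1.41 in).
sides = side_lengths(smallest_area=2.0, area_ratio=1.4, count=8)

# Side lengths as reported in the Materials section.
reported = [1.41, 1.67, 1.97, 2.34, 2.77, 3.28, 3.88, 4.59]

for number, (computed, given) in enumerate(zip(sides, reported), start=1):
    print(f"stimulus {number}: computed {computed:.2f} in, reported {given:.2f} in")
```

An area ratio of 1 : 1.4 corresponds to a side ratio of about 1 : 1.18, which is why adjacent cards differ only modestly in side length.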


Procedure 

The subjects within each of the three age groups were allocated to one of the two 
experimental conditions: the absolute group (AG) and the relative group (RG). 
Subjects were assigned randomly to the conditions within the following constraints: 
10 of the 4-year-olds to RG, 20 to AG; 16 of the 8-year-olds to each group; 10 of 
the adults to RG and 20 to AG. The unequal allocation of subjects was to allow for 
an extension of the experiment not reported here. Subjects were presented with a 
series of stimulus pairs. One member of each pair was always stimulus number 4 
(2.34 inch sides), and the paired stimuli were presented in both right and left positions; 
thus, there were 14 stimulus pairs. These pairs were presented to subjects in a 
random order, and when all 14 pairs had been presented a different random ordering 
of the series was used. 
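
As a rough sketch of how the 14 pairs arise, stimulus 4 is combined with each of the other seven cards in both left and right positions, and each pass through the series is freshly shuffled. The names and the fixed seed below are illustrative assumptions, not part of the original procedure.

```python
import random

STANDARD = 4  # stimulus number 4 appears in every pair
OTHERS = [n for n in range(1, 9) if n != STANDARD]


def build_series():
    """Return the 14 (left, right) pairs: stimulus 4 with each other card, both positions."""
    pairs = []
    for other in OTHERS:
        pairs.append((STANDARD, other))  # standard card on the left
        pairs.append((other, STANDARD))  # standard card on the right
    return pairs


def presentation_order(pairs, rng):
    """Return a shuffled copy of the series for one pass."""
    order = list(pairs)
    rng.shuffle(order)
    return order


rng = random.Random(0)  # seed chosen only so the sketch is reproducible
series = build_series()
first_pass = presentation_order(series, rng)
second_pass = presentation_order(series, rng)
print(len(series), "pairs per pass")
```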


The subject was seated on one side of a table with the experimenter sitting 
opposite and was invited to play a game. He was told that pairs of cards were going 
to be placed on the table and on the back of one of them a circle had been drawn. 
His task was to find the circle. A stimulus series (14 pairs) was then presented and 
the subject's choices noted for each pair, a correct choice being rewarded with a 
‘good’ from the experimenter. After every choice the cards were turned over and 
the circle revealed. For the absolute group stimulus number 4 always had the circle 
on the back; for the relative group the larger of the two cards had the circle on the 
back. While a stimulus pair was available all other stimuli were concealed from the 
subject. The task for the 8-year-olds was slightly different from that for the other 
two age groups in that the stimuli did not have any circles drawn on the back and the 
reward used was a small sweet (a Smartie) rather than verbal reinforcement. 


The first experimental session ended when all 14 pairs of the series had been 
presented. If a subject had not reached the criterion of nine correct choices in ten 
successive trials another series of 14 pairs of stimuli was presented one or two days 
later. In this second session subjects were given some help by the experimenter. 
For example, when an incorrect choice was made, the experimenter would turn over 
the correct card and say “No, this was the correct card; look at it very carefully and 
try to remember it”. This help was given only to subjects in the absolute condition 
as the majority of subjects in the relative condition reached criterion in the first 
session and few errors were made by the remaining relative subjects in the second 
session. At the end of the experiment subjects were asked how they had known which 
was the correct card. 
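
The learning criterion (nine correct choices in ten successive trials) can be expressed as a sliding-window check. The sketch below is one possible reading: it assumes the criterion is evaluated over every run of ten consecutive trials and that "trials to criterion" is the trial on which the window is first satisfied. The function name is hypothetical.

```python
def trials_to_criterion(outcomes, needed=9, window=10):
    """Return the 1-based trial on which criterion is first met, or None.

    `outcomes` is a sequence of booleans, one per trial (True = correct choice).
    """
    for end in range(window, len(outcomes) + 1):
        if sum(outcomes[end - window:end]) >= needed:
            return end
    return None


# Two early errors followed by a run of correct choices.
example = [False, True, False] + [True] * 9
print(trials_to_criterion(example))
```

Under this reading a subject can reach criterion with one error inside the final window, so the number of errors and the number of trials are related but not identical measures.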


RESULTS 


All but two of the 92 subjects reached the criterion of 9 out of 10 correct choices. 
The exceptions were two of the 4-year-olds in the absolute group for whom the 
experiment was terminated after 45 trials. 


Error scores 

Two measures of success were available from the training trials: the number of 
errors made, and the number of trials to criterion. These are shown in Table 1. 
These two measures were highly correlated at all age levels (for 4-year-olds, rho = 
0.94, P < 0.001; 8-year-olds, rho = 0.95, P < 0.001; adults, rho = 0.94, P < 0.001); 
consequently the analysis of results which follows concentrates on the error scores. 
It was not appropriate to carry out an analysis of variance on the data because of the 
unequal cell numbers, but t-tests revealed that for all three age groups significantly 
fewer errors were made in learning the relative task than in learning the absolute 
task (for 4-year-olds, 8-year-olds, and adults, values of t = 4.07, 5.45, and 6.72, 
respectively, all Ps < 0.001, two-tailed). 


TABLE 1 
ERROR SCORES AND TRIALS TO CRITERION, EXPERIMENT 1 

                        Relative Group        Absolute Group 
Subjects                Errors    Trials      Errors    Trials 
4-year-olds    mean      3.3      17.8        12.33     33.0 
               SD        2.98                  6.17 
8-year-olds    mean      1.7      12.6         8.6      26.1 
               SD        2.02                  4.6 
Adults         mean      0.9      10.0         7.4      22.8 
               SD        0.57                  4.96 


As is evident from the table, the number of errors made in both the relative and 
absolute groups declined with age. Of the appropriate comparisons (one-tailed), 
three were significant: 4-year-olds made more errors in the absolute condition than 
8-year-olds (t = 2.31, P < 0.05) and adults (t = 3.2, P < 0.01); 4-year-olds made more 
errors than adults in the relative condition (t = 2.53, P < 0.05). The percentage of 
the total errors made in the absolute condition was approximately the same for each 
of the age groups (81, 84 and 89 per cent for 4-year-olds, 8-year-olds, and adults, 
respectively). 


Verbal responses 

All subjects in the relative group were asked whether they had noticed that one 
of the two cards presented each time was the same card. Not one subject reported 
having noticed this, and many expressed some surprise that this was the case. 


The answers given by subjects in the absolute group to the question “How 
could you tell which was the correct card?” were classified under one of three 
headings: Absolute (a response such as “it's the same one”, “it's always that one”); 
Relative (“not the big one, not the small one”, “sometimes the big one and sometimes 
the small one”, “the middle one”); Don't know (“don't know”, “I just 
know” or no response at all). The results are shown in Table 2. It is apparent from 
the table that the number of ‘relative’ responses decreased with age, while the 
number of ‘absolute’ responses increased with age. This change in response with 
age was highly significant (χ² = 33.57, df = 4, P < 0.001). 


TABLE 2 
SUBJECTS’ VERBAL RESPONSES, ABSOLUTE GROUP, 
EXPERIMENT 1 

                              Type of response 
Subjects                Don’t know    Relative    Absolute 
4-year-olds (N = 20)        4            14           2 
8-year-olds (N = 16)        3             5           8 
Adults (N = 20)             0             0          20 
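
The test of association between age group and type of verbal response can be reproduced from the frequencies in Table 2. The following is a minimal sketch of a Pearson chi-square computation for a contingency table, written without external libraries; the function name is illustrative, and the result agrees with the reported statistic to within rounding.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Rows: 4-year-olds, 8-year-olds, adults.
# Columns: Don't know, Relative, Absolute (frequencies from Table 2).
table_2 = [
    [4, 14, 2],
    [3, 5, 8],
    [0, 0, 20],
]
statistic, df = chi_square_independence(table_2)
print(f"chi-square = {statistic:.2f}, df = {df}")
```

Several expected frequencies in the "Don't know" column are small (below 5), so the statistic should be read with the usual caution for sparse tables.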


DISCUSSION 


The results lend further support to the well-established finding that young 
children are able to code relationships more easily than absolute features. However, 
as can be seen from the error scores (Table 1) this seems also to hold for adults and 
older children: they too have problems with absolute codes and the same facility 
with, and apparent preference for, relative codes. Indeed adult subjects, when asked 
what strategies they had employed in solving the absolute task, all reported that they 
had begun by looking for a relationship (i.e., the bigger or smaller stimulus). Clearly, 
extraction of the absolute features of the correct stimulus did not seem to be the 
natural response even of the adults. The results, therefore, support neither of the 
hypotheses which suggest that changes with age are qualitative in nature (i.e., from 
relative to absolute responding or vice versa). Since the proportion of errors was 
approximately the same in the absolute condition for all age groups the only dis- 
cernible developmental trend is of an improvement in performance in both conditions 
with age. 


Eighteen of the 20 4-year-olds in the present study reached criterion on the 
absolute task in 46 trials or fewer, a result that contrasts with Lawrenson and Bryant 
(1972), who found that only 28 per cent of their 4-year-olds reached criterion in 46 
trials. Two reasons suggest themselves for this. Firstly, our subjects were cued to 
provide an absolute response: had they not been, they would certainly have taken 
longer to acquire it. A second possibility concerns the number of training pairs. 
Lawrenson and Bryant had only two pairs of stimuli available to each subject, while 
in the present study seven pairs were available. It is possible that the larger number 
of training pairs made learning easier because of the additional discrimination 
information present. 


We can make the usual distinction between competence and performance: it may 
be that children are capable from an early age of making absolute responses but that 
they are relatively low down in the response hierarchy or are not spontaneously 
produced in the typical experimental situation. However, the results do not allow 
of an unambiguous interpretation. While the majority of 4-year-olds reached criterion 
in the absolute task they justified their choices with verbal responses that could not be 
readily classified as absolute responses (see Table 2): thus it is not exactly clear what 
these children had acquired when they made an absolute response. In order to 
clarify further the nature of the absolute response in young children the next experi- 
ment was carried out. 


EXPERIMENT 2 


In this experiment 4-year-old children were trained to criterion on an absolute 
task. They were then presented with one or more transposed stimulus sets. Subjects' 
responses when presented with a transposition are known to be influenced by the 
extent to which the transposed stimuli differ from the training stimuli, and may also 
be influenced by the direction of the transformation (i.e., whether the transposition 
contains a stimulus or stimuli larger or smaller than those in the training set). Thus, 
in the first part of the experiment (2a) transpositions both ‘up’ and ‘down’ were 
used but both transposed series were near transpositions, that is, close to the original 
training stimuli. In order to complete the experiment in part 2b both near and far 
transpositions were used. All transposed series included the originally rewarded 
absolute stimulus. 
METHOD 
Subjects 
Subjects were 49 children, mean age 4 yrs 4 mths, range, 3 yrs 6 mths to 5 yrs. 
Thirty-two of the children were subjects in Experiment 2a, and 17 in Experiment 2b. 


Materials 

Five series of stimuli were taken from a population of 11 different lengths of 
cardboard strip 1 cm in width. Each strip maintained a constant ratio of 1 : 1.2 with 
its neighbour. The lengths of the stimuli from smallest to largest were: (1) 2.8 cm 
(2) 3.36 cm (3) 4.03 cm (4) 4.84 cm (5) 5.81 cm (6) 6.97 cm (7) 8.36 cm (8) 10.03 cm 
(9) 12.04 cm (10) 14.44 cm (11) 17.33 cm. Two series of eight stimuli were used in 
Experiment 2a, stimuli 1-8, and 4-11. Three series of seven stimuli were used in 
Experiment 2b, stimuli 1-7, 3-9, and 5-11. 


Procedure 

For Experiment 2a, 16 of 32 subjects were randomly assigned to a transposition 
upwards (TU) condition, and 16 to a transposition downwards (TD) condition. 
Subjects in each condition were presented with stimulus pairs, the members of each 
pair being shown in both right and left positions. Presentation of the stimuli was as 
described for the absolute group of Experiment 1 and help was provided after 14 
trials. The TU group were trained to choose stimulus 5 from the ‘small’ series 
(stimuli 1-8) until the criterion of nine correct choices in ten consecutive trials was 
reached. Subjects in the TD group were trained to choose stimulus 7 from the 
‘large’ series (stimuli 4-11) until criterion was reached. 


When criterion was reached a transposed series was presented to the subjects: 
the TU group received the ‘large’ series; the TD group received the ‘small’ series. 
All of the eight stimuli in the transposed series were simultaneously presented in the 
form of a circle, the position of each stimulus within the circular arrangement being 
randomly determined. Each child was asked to point to the previously reinforced 
stimulus (i.e., “can you show me the circle now?”). As will be described below, many 
children did not choose the ‘correct’ stimulus. In order to ensure that they had not 
either forgotten what they had learned, or been disrupted by the simultaneous 
presentation of the transposition stimuli, 13 of the TU subjects were, following the 
transposition, given a simultaneous presentation of the original training set, and 
again asked to find the correct stimulus. 


In Experiment 2b, 17 children were trained to choose stimulus 7 from the stimulus 
series 5 to 11. Paired presentation of the stimuli was as described above. When 
criterion was reached each subject was presented with three tests, each being separated 
by ten retraining trials with the original training series. The first two tests were 
transpositions: eight of the subjects were tested first with a series two steps removed 
along the size continuum (stimuli 3 to 9) followed by a series four steps removed 
(stimuli 1 to 7); the other nine subjects received the same presentations in reverse 
order. The third test consisted of a presentation of the original stimuli (stimuli 5-11). 
Simultaneous presentation of the stimuli, as described for Experiment 2a, was used 
in all tests. 


RESULTS 

Experiment 2a 

The subjects reached criterion during training in an average of 26 trials, making 
an average of 8.6 errors. The children's choices of stimulus on the transposition 
trials are given in Table 3 for both TD and TU subjects. None of the children 
particularly hesitated when making their choice or indicated that the correct stimulus 
was difficult to find. Nevertheless, it can be seen from the table that only 
five of the 32 children selected the stimulus that had been previously reinforced 
(stimulus 7 for TD, and stimulus 5 for the TU group). It is most unlikely that this 
pattern of responding was caused by any disruptive effects of the method of 
simultaneous presentation itself: 12 of the 13 TU subjects who were given a simultaneous 
presentation of the training set selected, correctly, stimulus 5 (the other child chose 
stimulus 7). 


TABLE 3 
SUBJECTS’ CHOICE OF STIMULUS ON TRANSPOSITION TRIALS, 
FOR TD AND TU GROUPS, EXPERIMENT 2A 

                     Unclassified       Relative           Absolute* 
Stimulus     TD        1     2        3     4     5       6    [7]    8 
number       TU       11    10        9     8     7       6    [5]    4 

Subjects'    TD        0     0        1     8     3       3     1     0 
choices      TU        0     1        1     4     4       2     4     0 

Totals                   1                21                 10 

* The stimulus in brackets was reinforced in the training trials. 


Each subject’s choice was assigned to one of three categories (see Table 3). 
(1) Absolute: if the choice was of the stimulus previously reinforced or one of the 
two adjacent stimuli. (2) Relative: choice of stimulus number 4 was considered a 
relative choice for the TD group, and number 8 for the TU group, as these stimuli 
maintain the same relative position within the transposed series (the fourth smallest, and 
the fourth largest, respectively) as did the correct absolute stimulus within the original 
training stimuli. The two adjacent stimuli were also classified as relative choices. 
(3) Any choice other than these six stimuli was assigned to an Unclassified category. 
The children did not make their choices on a random basis: had they done so the 
expected number of choices assigned to the Absolute, Relative and Unclassified 
categories (for the combined results of the TD and TU groups) would be 12:12:8, 
respectively: as can be seen from Table 3 the observed frequencies were 10 : 21 :1, 
a difference that is highly significant (χ² = 13.2, df = 2, P < 0.005). These findings 
indicate that when children are faced with this sort of transformation they will usually 
not select the stimulus that had previously been reinforced; rather, they are more 
likely to select as the correct stimulus one that maintains the same relative position 
within the stimulus set. 
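
The classification rule and the comparison with chance responding described above can be sketched as follows. The choice frequencies are those read from Table 3; the chance expectation assumes that each of the eight stimuli in a transposed series is equally likely to be picked. Function and variable names are illustrative.

```python
from collections import Counter

CATEGORIES = ("Absolute", "Relative", "Unclassified")


def classify(choice, reinforced, relative):
    """Assign a chosen stimulus number to one of the three categories."""
    if abs(choice - reinforced) <= 1:  # reinforced stimulus or a neighbour
        return "Absolute"
    if abs(choice - relative) <= 1:    # same relative position or a neighbour
        return "Relative"
    return "Unclassified"


# Transposed series and number of children choosing each stimulus (Table 3).
GROUPS = {
    "TD": {"reinforced": 7, "relative": 4, "series": range(1, 9),
           "choices": {3: 1, 4: 8, 5: 3, 6: 3, 7: 1}},
    "TU": {"reinforced": 5, "relative": 8, "series": range(4, 12),
           "choices": {10: 1, 9: 1, 8: 4, 7: 4, 6: 2, 5: 4}},
}

observed = Counter()
expected = Counter()
for group in GROUPS.values():
    n_children = sum(group["choices"].values())
    for stimulus, count in group["choices"].items():
        observed[classify(stimulus, group["reinforced"], group["relative"])] += count
    # Chance model (assumption): every stimulus in the series is equally likely.
    for stimulus in group["series"]:
        category = classify(stimulus, group["reinforced"], group["relative"])
        expected[category] += n_children / len(group["series"])

chi_square = sum((observed[c] - expected[c]) ** 2 / expected[c] for c in CATEGORIES)
print(dict(observed), dict(expected), round(chi_square, 2))
```

Because three of the eight stimuli count as Absolute, three as Relative and two as Unclassified, random choice by 32 children gives the expected split of 12 : 12 : 8.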


Experiment 2b 

The subjects reached criterion in an average of 31 trials, making an average of 
10.3 errors. When given simultaneous presentation of the original training set (the 
third test given) all but three of the 17 subjects chose either the correct absolute 
stimulus (number 7) or one of the adjacent stimuli, which again indicates that this 
method of presentation is not disruptive. The children's choices in the two- and 
four-step transposition tests, assigned to Absolute, Relative, and Unclassified categories 
as described for Experiment 2a, are given in Table 4. Many of the children's first 
choices (26 of the 36 responses) could not be so classified as the stimulus chosen fell 
midway between an absolute and a relative response. For example, in the two-step 
test stimulus 6 lies between the absolute stimulus, number 7, and the relative stimulus, 
5. For such choices the child's second selection was used to classify the response. 
The data from the three-step transposition of Experiment 2a are also included in 
Table 4. 
TABLE 4 
SUBJECTS’ CHOICE OF STIMULUS ON TRANSPOSITION TRIALS, 
EXPERIMENTS 2A AND 2B 

                   Unclassified    Relative    Absolute 
2-step test*            2             11           4 
3-step test**           1             21          10 
4-step test*            0              9           8 

Totals                  3             41          22 

* Experiment 2b    ** Experiment 2a 


As can be seen from the table, children are more likely to make a relative choice 
than an absolute choice when faced with these types of transposition (χ² = 5.14, 
df = 1, P < 0.05). Previous researchers (e.g., Kuenne, 1946), using a different method- 
ology, have also found that children are more likely to make a relative choice with 
near transpositions. While the trend for the present results is not significant it is in 
the same direction: the percentage of children making a relative rather than absolute 
choice was 73, 68 and 53 for the two-, three-, and four-step transpositions, respectively. 
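
The percentages and the overall comparison can be recomputed from the Relative and Absolute columns of Table 4. The sketch below assumes that the reported chi-square compares the total number of relative and absolute choices against an even split and applies a continuity correction; the source does not state how the statistic was computed, but this assumption reproduces the reported value.

```python
# (relative, absolute) choices for each transposition test, from Table 4.
TABLE_4 = {"2-step": (11, 4), "3-step": (21, 10), "4-step": (9, 8)}


def percent_relative(relative, absolute):
    """Percentage of classified choices that were relative."""
    return 100.0 * relative / (relative + absolute)


def chi_square_even_split(a, b, continuity=True):
    """One-degree-of-freedom chi-square of two counts against a 50:50 split."""
    expected = (a + b) / 2
    correction = 0.5 if continuity else 0.0  # continuity correction (assumed)
    return sum((abs(x - expected) - correction) ** 2 / expected for x in (a, b))


percentages = {test: round(percent_relative(*counts)) for test, counts in TABLE_4.items()}
total_relative = sum(rel for rel, _ in TABLE_4.values())
total_absolute = sum(ab for _, ab in TABLE_4.values())
chi_square = chi_square_even_split(total_relative, total_absolute)

print(percentages)
print(f"chi-square = {chi_square:.2f}")
```

Unclassified choices are excluded from both calculations, which is why the percentages are based on 15, 31 and 17 children respectively.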


DISCUSSION 


The results from both Experiments 1 and 2 demonstrate that 4-year-old children 
can be trained to make what would usually be considered an absolute response. 
However, the children's verbal responses in Experiment 1, and their choices in the 
transposition tests of Experiment 2 strongly suggest that the children are basing their 
responding on the relational cues that are present at the time of training as well as, 
or even perhaps, instead of, the absolute cues. This finding is particularly striking in 
view of the fact that during training only an absolute response was reinforced, and the 
children at no time saw more than two stimuli at a time: they must, therefore, have 
been extracting the properties of the whole stimulus set from paired presentation of 
members of the set. The results from the first two experiments are compatible with 
the view that children of this age are simply not able to extract and use the absolute 
properties of stimuli, a position that has been forcefully expressed by Bryant (1974). 
To investigate this possibility further, Experiment 3 was carried out. 


EXPERIMENT 3 


In this experiment subjects were trained to criterion using presentations of sets of 
seven stimuli drawn from the whole stimulus set, rather than presentations of paired 
stimuli from the set. Four groups of children were used as subjects. The first group 
could solve the task using both absolute and relative cues; group 2 could use relative 
cues only; group 3 could use absolute cues only; group 4 could use neither relative 
nor absolute cues. 


If it is the case that young children are able to use relative, but not absolute cues, 
two predictions can be made: (1) groups 1 and 2 should take approximately the same 
number of trials to solve their respective problems; (2) groups 3 and 4 should find 
great (and equal) difficulty in solving their problems. 

METHOD 
Subjects 

The subjects were 31 children, mean age 4 yrs 6 months, range 3 yrs 7 months to 
4 yrs 10 months. 

Materials and procedure 

The subjects were assigned randomly to the four conditions: eight each to groups 
1, 2 and 3, and seven to group 4. Subjects in each group were presented with three 
sets of seven stimuli, drawn from the population of 11 different lengths that were used 
in Experiment 2. The composition of the sets of stimuli differed between groups, and 
the ‘correct’ stimulus varied from set to set and from group to group as shown in 
Table 5. For group 1 the correct stimulus from set to set (number 6) was the same 
absolute size and maintained the same relationship (the middle stimulus) with the 
other members of the set. For group 2 the correct stimulus was a different size from 
set to set, but the same relationship (again, the middle one). For group 3 the correct 
stimulus was the same size but the relationship it bore to the other members of the set 
varied from set to set. For group 4 both the relative and absolute properties of the 
correct stimulus varied from set to set. 








TABLE 5 
STIMULUS SETS PRESENTED TO THE FOUR GROUPS OF SUBJECTS, EXPERIMENT 3 

Subject   Cues 
group     available               Set 1                   Set 2                    Set 3 
1         Absolute and Relative   3, 4, 5, 6, 7, 8, 9     2, 3, 5, 6, 7, 9, 10     1, 3, 5, 6, 7, 9, 11 
2         Relative only           3, 4, 5, 6, 7, 8, 9     1, 2, 3, 4, 5, 6, 7      5, 6, 7, 8, 9, 10, 11 
3         Absolute only           3, 4, 5, 6, 7, 8, 9     1, 2, 3, 4, 5, 6, 7      5, 6, 7, 8, 9, 10, 11 
4         Neither                 3, 4, 5, 6, 7, 8, 9     1, 2, 3, 4, 5, 6, 7      5, 6, 7, 8, 9, 10, 11 


The correct stimulus in each stimulus set is in bold type. 
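
The logic of the set construction for Groups 1 and 2 can be checked directly from the stimulus numbers. The sketch below covers only those two groups, because the text states that their correct stimulus was the middle one of each set; names are illustrative.

```python
# Stimulus sets for Groups 1 and 2, as listed in Table 5.
SETS = {
    1: [[3, 4, 5, 6, 7, 8, 9], [2, 3, 5, 6, 7, 9, 10], [1, 3, 5, 6, 7, 9, 11]],
    2: [[3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5, 6, 7], [5, 6, 7, 8, 9, 10, 11]],
}


def middle(stimuli):
    """Return the stimulus occupying the middle rank of an odd-sized set."""
    ordered = sorted(stimuli)
    return ordered[len(ordered) // 2]


middles = {group: [middle(s) for s in sets] for group, sets in SETS.items()}

# Group 1: the middle stimulus is the same size in every set, so absolute
# and relative cues coincide. Group 2: the middle stimulus changes size from
# set to set, so only the relative cue is constant.
for group, values in middles.items():
    constant = len(set(values)) == 1
    print(f"Group {group}: middle stimuli {values}, constant size: {constant}")
```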


Each child was told that he was going to be shown lots of cards one of which 
would have a circle on the back, and his task was to find the card bearing the circle. 
The experimenter then presented the seven stimuli from one of the appropriate sets 
arranged in a circular manner on a table in front of the child. The child made his choice 
and the card was turned over. If the card bore a circle the child was praised. If it did 
not the child was asked to make another attempt and continued turning over the cards 
until the circle was found. Having found the circle the next set of stimuli was pre- 
sented and the procedure repeated. The three sets of stimuli appropriate for a 
particular child were presented in a predetermined random order. After each block 
of three trials the same sets were again presented in a random order, and the trials 
continued as necessary until the child reached criterion (i.e., nine correct first choices 
in ten successive presentations), or until 200 trials had elapsed, whichever was the 
earlier. Each training session lasted as long as interest could be maintained and 
the interval between sessions was typically of the order of 24 hours. It is worth 
noting, in view of the procedure used in Experiments 1 and 2, that in the present 
experiment the children were never cued, or given hints, concerning the nature of the 
correct stimulus. 

RESULTS 


Two measures of performance were taken: the number of trials to reach criterion, 
and the number of attempts made on all the trials (a measure of errors) until criterion 
was reached. The two were highly correlated (rho = 0.98, P < 0.001), thus only the 
number of attempts was further analysed. The findings for each of the groups are 
given in Table 6: it can be seen that all subjects in groups 1, 2 and 3, and no subject 
in group 4, reached criterion within 200 trials. An ANOVA indicated that highly 
significant differences existed between the four groups (F(3, 28) = 32.87, P < 0.001). 


TABLE 6 
TRIALS TO CRITERION, AND NUMBER OF ATTEMPTS, 
EXPERIMENT 3 

Subject    Trials to criterion      Attempts made 
group       Mean       SD           Mean       SD 
1           52.9      32.4         160.2     109.0 
2          150.5      29.8         493.9     146.9 
3           89.5      41.6         304.0     165.0 
4          200+        —           788.7      49.4 


Subsequent analyses, using t-tests, showed that all paired comparisons were significant 
(Group 1 with 2, t = 4.83, P < 0.002, two-tailed; 1 with 3, t = 1.92, P < 0.05, one-tailed; 
2 with 3, t = 2.27, P < 0.05, two-tailed; 1, 2, and 3 with 4, all Ps < 0.001, two-tailed). 
As can be seen from Table 6, however, the results are not in the direction predicted: 
learning was quickest when both absolute and relative cues were available (Group 1); 
when absolute cues only were available (Group 3) learning was quicker than when 
only relative cues were available (Group 2); when neither cue was available (Group 4) 
learning did not occur. 


DISCUSSION 


The three experiments described here have produced somewhat conflicting 
results. Those from the first two experiments confirm and extend previous findings 
suggesting a preference for relative coding in young children, and difficulty in respond- 
ing to the absolute features of discriminatory stimuli. In Experiment 1 all three age 
groups learned a relative task more easily and quickly than an absolute task. When 
the 4-year-old children were specifically cued to make an absolute response (the 
Absolute Group of Experiment 1, and all children in Experiments 2a and 2b) it was 
clearly the case that they were detecting and responding to relative cues, as indicated 
by their verbal responses to the question “How could you tell which was the correct 
card?” (Experiment 1), and also by their choice of stimulus on the transposition 
tests of Experiments 2a and 2b. The results of these two experiments can be 
interpreted as supporting Bryant's (1974) view that young children depend primarily on 
relative codes and have great difficulty when they are required to register absolute 
values. 


However, the findings from Experiment 3 are in the opposite direction and 
necessitate the conclusion that young children can and do respond to the absolute 
properties of stimuli. The condition in which absolute features were held constant 
while relative features varied (Group 3) produced quicker learning than the converse 
condition, Group 2, where relative features were held constant from stimulus set to 
stimulus set, and absolute features were varied. Where neither cue was available 
(Group 4) learning did not occur: thus, the fact that children learned in Groups 2 
and 3 points to the conclusion that they were able to detect and respond to both types 
of cue. This conclusion is strengthened, of course, by the finding that the most rapid 
learning occurred in Group 1 where both cues were available. 


Clearly, the results from the three experiments are not easy to reconcile: the 
first two produced little absolute responding, the third a lot. It may be possible to 
reconcile the differences by a consideration of the different methodologies of the 
experiments. The mode of stimulus presentation of Experiment 3 differs in two 
major respects from that of Experiments 1 and 2. First, multi-stimulus, rather than 
paired, presentation was used. Second, the correct (i.e., rewarded) stimulus in the 
relative group (Group 2) of Experiment 3 occupied the intermediate position in the 
stimulus set, rather than being the largest or the smallest. In order to investigate 
more fully the effects these differences have on relative and absolute responding the 
final experiment was carried out. 


EXPERIMENT 4 


Forty-eight 5-year-old children (mean age 5 yrs 6 mths, range 4 yrs 11 mths to 
6 yrs 1 mth) were trained to criterion on discrimination learning tasks, using a 3 × 2 
design. There were three response conditions: in the first the children were rewarded 
for an absolute response; in the second, for responding to the largest stimulus in each 
presentation (the relative-larger group); and in the third, for responding to the 
intermediate stimulus in each presentation (the relative-middle group). Two different 
stimulus set sizes were used: in one the subjects were presented with one of three sets 
of three stimuli per trial, drawn from a population of seven stimuli; in the other the 
subjects were presented with one of three sets of seven stimuli per trial, drawn from a 
population of eleven stimuli. For both stimulus conditions the three stimulus sets 
were successively presented, one per trial, in a randomly determined order, this being 
followed by another random ordering, and so on. Half the children in each response 
condition were allocated to each stimulus condition: thus, there were six conditions 
in all. The subjects were randomly allocated with the constraint that there were 
eight children in each condition. 


The stimuli were drawn from the sets described earlier. They were presented in 
a manner similar to that described under Experiment 3, and the trials continued until 
the child reached criterion (i.e., nine correct choices in ten successive presentations), 
or until 120 trials had elapsed. 
RESULTS 


The mean number of trials on which errors were made, for the different conditions, 
are given in Table 7. А 3x2 АМОУА indicated that, overall, the three-stimulus 
group learned more easily than the seven-stimulus group (F1, 42 = 37-71, P « 0-001), 


TABLE 7 
NUMBER OF TRIALS ON WHICH ERRORS WERE MADE, EXPERIMENT 4 


                              Response condition 

Stimulus        Absolute          Relative-larger     Relative-middle 
condition       Mean     SD       Mean     SD         Mean      SD 

3 stimuli       11.75    13.23    1.38     0.99       40.5      15.67 
7 stimuli       26.5     26.1     4.88     3.95       101.38    6.61 


and there was a highly significant effect of response conditions (F2, 42 = 90.83, 
P < 0.001). Post hoc Scheffé tests showed that the relative-larger condition was easier 
than the absolute condition (F2, 42 = 9.25, P < 0.025) and that the absolute condition, 
in turn, was easier than the relative-middle condition (F2, 42 = 97.01, P < 0.001). 
This pattern of results enables interpretation of the conflicting results obtained from 
the earlier experiments, and these are now discussed. 


DISCUSSION 


The majority of studies investigating the absolute/relative question have used 
paired presentation of the stimuli. This was also the method used in Experiments 1 
and 2; while the total stimulus set was large the children saw only two stimuli at a 
time. It seems reasonable to suppose that if a subject is presented with two stimuli 
that differ in some obvious manner (i.e., one is larger than the other) he will naturally 
be aware of this difference, and when asked to make a discrimination between the 
stimuli will attempt, at least initially, to base his response on such relative 
cues. After all, in such a situation, even when an absolute response is rewarded, the 
correct stimulus of any pair will be either the larger or the smaller. In addition, 
verbal labels are more readily available for simple relations of ‘larger’ or ‘smaller’ 
than they are for specific lengths. In Experiment 4 it was again found to be the case 
that responding to a simple relational cue (the largest) produced the quickest learning. 
In this experiment the only effect of multi-stimulus presentation was to increase the 
difficulty of all tasks as the number of stimuli increased (from three to seven per trial). 


It is known (Zeiler, 1967; Reese, 1968) that the transposition response (responding 
to relative cues) is found less frequently on test trials with intermediate 
problems. In Experiments 3 and 4 a relational response to the intermediate member 
of the stimulus set was more difficult to elicit than an absolute response. Clearly, 
the ease with which a relational response is learned will depend upon the type of 
response required (larger/smaller versus intermediate). 


An interesting finding to emerge from the present studies is that young children 
can learn to respond to the absolute properties of stimuli (Experiments 2, 3 and 4), 
and that this mode of response is consistently easier to elicit than an intermediate 
relative response (Experiments 3 and 4). The findings from Experiment 2 that subjects 
rewarded for responding on an absolute basis were also detecting relational cues, and 
from Experiment 3 that the most rapid learning occurred in the condition where both 
relative and absolute cues were available, support the view expressed by Lane and 
Rabinowitz (1977) that “... the subject learns about both the absolute and relational 
aspects of stimuli during training” (p. 413). 


Bornstein (1975) has demonstrated that infants discriminate on an absolute 
basis between stimuli that differ in hue. The present experiments have shown that 
young children are able to detect and respond to both the absolute and relative 
properties of stimuli that differ in size. Our findings indicate that the widely accepted 
view that young children have only relative cues available to them is incorrect. We 
suggest that the preferred mode of responding is primarily determined, not by age, 
but by the characteristics of the experimental design and by the nature of the absolute 
or relative response required. 


LANGUAGE ACQUISITION AND COGNITIVE 
DEVELOPMENT IN THE ACQUISITION OF KINSHIP 
TERMS 


By ANN MACASKILL 
(Department of Psychology, University of Aberdeen) 


SUMMARY. This paper presents normative data on the comprehension of kin terms by 
children, which are then examined in terms of the Piagetian hypothesis that language 
acquisition is dependent on cognitive development in the child. The order in which 
kin terms were comprehended followed a definite pattern in terms of the sorts of 
relationships which were involved, the number of relational components in the terms 
and the cognitive demands that the comprehension of these relationships made on the 
child. These data seemed to support the Piagetian position, and particular cognitive 
operations which appeared to relate causally to the development of kin terms were 
discussed. 


INTRODUCTION 


THIS paper presents some data in support of the Piagetian hypothesis that language 
acquisition is dependent on cognitive development in the child. In this approach, 
the existence of innate language acquisition devices in the child, as postulated by 
Chomsky (1957, 1965, 1966, 1976) and McNeill (1966) to explain language develop- 
ment, is rejected and Piaget and Inhelder (1969) argue instead that cognitive structures 
which are built up during the sensori-motor period through interaction with the 
environment, precede language and are pre-requisites for language development. 


First of all, it is suggested that language does not occur until the end of the 
sensori-motor period as it is not until then that the child can conceive of himself as a 
person distinct from the objects he acts upon. This realisation allows him to differ- 
entiate himself from others and hence makes it possible to have communication with 
others. One of the correspondences which is thought to occur between sensori-motor 
schemata and language abilities involves the development of the child's ability to 
order things temporally or spatially, which corresponds to the concatenation of 
linguistic elements which occurs in language development. There is also the development 
of classification of objects by his actions towards them, such as when he does 
the same thing with a whole category of objects or applies a whole category of action 
schemata to the one object. This has as its linguistic counterpart the categorisation 
of language into major elements like noun phrase and verb phrase. The linguistic 
recursive ability of embedding phrase markers within other phrase markers is thought 
to derive from the child's ability to embed action schemata into one another. 

Language acquisition cannot begin, according to Piagetians, until the operations 
of the sensori-motor period, some of which have just been described, have been 
acquired by the child. Thus language is seen to build on and further develop a 
number of cognitive abilities that have already arisen during the sensori-motor period. 
This process is thought to continue throughout development so that for the child to 
acquire new linguistic forms he must have mastered the cognitive concepts which 
underlie these linguistic forms. 

The study reported in this paper was designed to collect normative data on one 
aspect of language acquisition in children which could then be examined in terms of 
this approach. The area of language selected for study was kinship terms. Kinship 
terms seemed ideally suited to this type of theoretical approach for several reasons. 
First of all the conceptual field of kinship terms represents a fairly compact, well- 
defined set of words which, while being small enough to handle easily, also varies 
considerably in conceptual complexity. Kin terms are also extremely interesting 
concepts in that as well as having perceptually given properties they also have 
attributes which have no perceptual correlates, namely their relational attributes. When 
fully developed, kin terms are fairly abstract concepts in that they do not always refer 
to the same people and can refer to different people depending on who is speaking. 
To add to the complexity, the same people can fulfil different kinship roles for 
different people, so that the child has to become aware of the essential variability of 
kinship roles, changing as they do depending on which relatives are being addressed 
and by whom. 


The interpretation of meaning adopted here is that described by Vygotsky (1962), 
where the development of word meaning is equated with conceptual development. 
Words express concepts and hence by studying how words acquire meaning for the 
child, we are also looking at the development of the child's concepts. Vygotsky 
pointed out that words such as table, for example, do not refer to single objects but 
to groups or classes of objects so that words are therefore generalisations and refer 
to concepts. Thus by examining the acquisition of kinship terms in the child we are 
studying the development of his kinship concepts and here we are interested in the 
cognitive demands that the conceptual structure of particular terms makes on the child. 


METHOD 
Acquisition study 
Subjects. The subjects were 320 children, 160 boys and 160 girls, ranging in age 
from 3 years to 11:4 years (years:months). These children were selected from the nursery 
class and the seven forms of a primary school, with mean ages of 4:0.5, 
5:1.5, 6:1.5, 7:1, 8:2.5, 9:2, 10:1.5 and 11:1 respectively. There were 40 children in 
each group, half males and half females. 


Procedure. As the experimenter was interested in how much knowledge of kin 
terms the child actually had and not the degree of difficulty he had in expressing this 
information verbally, it was decided to look at the comprehension of kin terms by 
children. A sentence completion test was devised for this purpose. This consisted 
of a definition for a kin term with the word for the term being supplied by the child. 
After experimenting with both formal and more personal styles of definitions it was 
found that the latter was comprehended at a much younger age and also that this was 
the style of language used spontaneously by most children and a majority of adults 
when discussing kinship terms. See Appendix I for the sentence completion test. 


Subjects were tested individually in an empty classroom. Initially all the subjects 
were told that they were to be asked about the various people who made up a family. 
The experimenter would read them a sentence which described a family member, but 
the name the person was called was missing and the child was required to supply this 
word at the end of the sentence. Subjects were given the following example: “The 
people you live with are your FAMILY”. 

This served both as a practice trial and to induce mental set for the rest of the 
questions. It also allowed the experimenter to check that the younger subjects 
understood the term, and further explanations were given to younger subjects if 
required. 

Each definition was then slowly read out to each child and the child's responses 
recorded. The presentation order was randomised for each child. 


RESULTS 
To calculate norms for the ages at which children are able to comprehend various 
kin terms the normal Piagetian convention was followed. This involved reporting the 
age levels at which 75 per cent of subjects successfully completed particular terms. 
These results are shown in Table 1. 


TABLE 1 
KIN TERMS SUCCESSFULLY COMPLETED AT EACH AGE LEVEL 


Class    Mean age    Kin terms successfully completed 

P1       5:1.5       Mother, Father, Grandmother 
P2       6:1.5       Grandfather, Sister, Brother 
P4       8:2.5       Aunt, Uncle, Cousin 
P5       9:2         Daughter, Son 
P6       10:1.5      Granddaughter, Grandson 
P7       11:1        Niece, Nephew 


It is apparent from the data that the comprehension of kin terms improves with 
the age of the child, and that kin terms have differential rates of development and 
consequently criterion levels are reached at different ages for different terms. 
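
The 75 per cent convention used to build Table 1 can be sketched as follows. This is a minimal illustration, assuming that a term is credited to the youngest class in which at least 75 per cent of the 40 children completed it; the function name and the pass counts in the usage note are hypothetical and are not data from the study.

```python
from typing import List, Optional, Tuple


def criterion_class(pass_counts: List[Tuple[str, int]],
                    group_size: int = 40,
                    threshold: float = 0.75) -> Optional[str]:
    """Return the first class (ordered youngest to oldest) meeting criterion.

    pass_counts holds (class label, number of children completing the term).
    Assumption: criterion is met when the proportion passing is >= threshold.
    """
    for label, passed in pass_counts:
        if passed / group_size >= threshold:
            return label
    return None
```

For instance, with invented counts of 12, 29 and 30 children passing in P1, P2 and P3, the term would be credited to P3, since 30 of 40 is exactly 75 per cent.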


Underlying patterns apparent in the data 

Child-Centred relationships/Other-Centred relationships. An analysis of the order 
in which terms are first comprehended by children found that an underlying pattern 
in terms of the types of relationships involved in each term was apparent. First of 
all, the terms could be divided into two groups, those which defined other people's 
relationship to the child, which will be called Child-Centred relationships, and the 
terms which defined the child's relationship to others, that is Other-Centred relation- 
ships. This gave the following division of terms: 


Child-Centred relationships: mother, father, grandmother, grandfather, sister, 
brother, aunt, uncle, cousin. 


Other-Centred relationships: daughter, son, granddaughter, grandson, niece, 
nephew. 


The terms sister, brother and cousin were included in the Child-Centred category 
of relationships as it was apparent, both from the original study of the terms brother 
and sister by Piaget (1928) and the Elkind (1962) replication of this experiment, that 
at the age at which these terms are acquired the child still tends to think of a sibling 
or a cousin as something that he has rather than as something he is. On questioning 
he will admit to being a brother but his spontaneous response is in terms of having a 
brother and this seems to be true even of adult subjects. 


Componential analysis of kin terms 

Next the individual terms in their acquisition order were studied in detail, using 
the method of relational components proposed by Bierwisch (1970) and applied to kin 
terms by Haviland and Clark (1974). This is a form of componential analysis which 
in addition to sex and age components has semantic components representing relations 
between two or perhaps more terms. A relational component is defined as being “a 
representation of the relationship between two or more entities” (Bierwisch, 1970, 
p. 172). Bierwisch defines the central relational component within the kinship 
system as being (X parent of Y), with its inverse (Y child of X). These relational 
components are combined with property features like (male X) and (female X) to 
give lexical entries for kin terms such as, mother: (X parent of Y) (female X). Within 
this system X and Y are variables which will change depending on whose relatives 
are being discussed. There are two redundancy rules incorporated into the system 
which apply to all kinship terms, namely that all entries are animate and also that the 
relational component (X parent of Y) carries the restriction (adult X). Dummy 
entities are introduced to allow for intermediate relationships which are necessary to 
account for the relationship of X to Y. For example, the entry for grandmother is 
(X parent of A) (A parent of Y) (female X). Here the relationship must involve the 
third person A. The letter X always refers to the person holding the relationship 
which is being defined and hence the property component, usually male or female, 
will apply to X also. 


Applying this system gives the following descriptions of kin terms: 


Mother:        (X parent of Y) (Female X) 
Father:        (X parent of Y) (Male X) 
Grandmother:   (X parent of A) (A parent of Y) (Female X) 
Grandfather:   (X parent of A) (A parent of Y) (Male X) 
Sister:        (X child of A) (A parent of Y) (Female X) 
Brother:       (X child of A) (A parent of Y) (Male X) 
Aunt:          (X child of A) (A parent of Z) (Z parent of Y) (Female X) 
Uncle:         (X child of A) (A parent of Z) (Z parent of Y) (Male X) 
Cousin:        (X child of A) (A child of Z) (Z parent of B) (B parent of Y) 
Daughter:      (X child of Y) (Female X) 
Son:           (X child of Y) (Male X) 
Granddaughter: (X child of A) (A child of Y) (Female X) 
Grandson:      (X child of A) (A child of Y) (Male X) 
Niece:         (X child of A) (A child of Z) (Z parent of Y) (Female X) 
Nephew:        (X child of A) (A child of Z) (Z parent of Y) (Male X) 

As the sibling relationship is important for specifying terms it can be written in 
an abbreviated form which then simplifies some of the entries. For example the term 
Aunt then becomes (X sib A) (A parent of Y) (Female X). 
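
The lexical entries and the sibling abbreviation can be written out as a small data structure. The sketch below is an illustration only: the tuple encoding, the name ENTRIES and the function abbreviate are assumptions introduced here, the sex features are omitted, and only the female (or sex-neutral) terms are listed. Unlike the abbreviated entry for Aunt given in the text, the sketch does not rename the remaining variable, so Aunt comes out as (X sibling Z) (Z parent of Y).

```python
# Each relational component is (relation, left, right), read as
# "left is <relation> of right". Sex features such as (Female X) are omitted.
ENTRIES = {
    "mother": [("parent", "X", "Y")],
    "grandmother": [("parent", "X", "A"), ("parent", "A", "Y")],
    "sister": [("child", "X", "A"), ("parent", "A", "Y")],
    "aunt": [("child", "X", "A"), ("parent", "A", "Z"), ("parent", "Z", "Y")],
    "cousin": [("child", "X", "A"), ("child", "A", "Z"),
               ("parent", "Z", "B"), ("parent", "B", "Y")],
    "daughter": [("child", "X", "Y")],
    "granddaughter": [("child", "X", "A"), ("child", "A", "Y")],
    "niece": [("child", "X", "A"), ("child", "A", "Z"), ("parent", "Z", "Y")],
}


def abbreviate(components):
    """Rewrite each adjacent pair (P child of Q) (Q parent of R) as (P sibling R)."""
    out = []
    i = 0
    while i < len(components):
        cur = components[i]
        nxt = components[i + 1] if i + 1 < len(components) else None
        if (nxt is not None and cur[0] == "child" and nxt[0] == "parent"
                and cur[2] == nxt[1]):
            # The shared dummy entity (the common parent) drops out.
            out.append(("sibling", cur[1], nxt[2]))
            i += 2
        else:
            out.append(cur)
            i += 1
    return out
```

Applied to the entry for cousin, abbreviate yields (X child of A) (A sibling of B) (B parent of Y); applied to niece it yields (X child of A) (A sibling of Y).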


It became apparent in the study that it is the relational components of kin terms 
which present the greatest difficulty to children, the property features such as sex 
being visually given and hence acquired relatively easily. For this reason it was 
decided to concentrate on explaining the acquisition of the relational components of 
kin terms. Looking at the relational components of kin terms as they are described 
by the Bierwisch system showed that there were certain regularities present in the 
acquisition data in terms of the types of relationships existing between X and Y, and 
the order in which they are acquired. 


Child-Centred relationships 

The terms acquired first are mother and father, which involve only one relational 
component, (X parent of Y), as shown in Figure 1. This is an asymmetric 
relationship involving one component. 


Next the terms grandmother and grandfather, (X parent of A) (A parent of Y), 
are acquired. Here again, as shown in Figure 2, the relationships are asymmetrical 
and the relational component is repeated. This is known in linguistics as recursion. 
The type of relationship involved in the terms grandmother and grandfather can be 
described as being asymmetrical with one relational component plus recursion. 


The terms sister and brother, (X child of A) (A parent of Y), are acquired next. 
Here there are two relational components required to describe the kin relationship 
and it can be seen from Figure 3 that the relationship between X and Y is a symmetric 
one, although it does depend on the prior understanding of the asymmetric relations 
‘parent of’ and ‘child of’. 


Acquired next are the terms aunt and uncle, (X child of A) (A parent of Z) (Z 
parent of Y), which, using the information which children have acquired about the 
sibling relationship, simplify to (X sibling A) (A parent of Y). Thus aunt and uncle 
can be seen to involve two relational components, one basic component and one higher 
order component. From Figure 4 it can be seen that the relationship of X to Y is an 
asymmetrical one but it also requires prior understanding of the symmetrical sibling 
relationship X to A, and the asymmetrical relationship A to Y. Thus the kin terms 
aunt and uncle describe an asymmetrical relationship which is dependent on 
understanding of both asymmetric and symmetric relationships and involves two relational 
components, one of which is a higher order component, ‘sibling of’. 


FIGURES 1-5 
CHILD-CENTRED RELATIONSHIPS 


Similarly the term cousin, (X child of A) (A child of Z) (Z parent of B) (B parent 
of Y) can be simplified to (X child of A) (A sibling of B) (B parent of Y). The term 
cousin is slightly more complex, involving three relational components, one of which 
is a higher order component. From Figure 5 it can be seen that the relationship of 
X to Y is a symmetrical one as is the relationship between A and B on which it 
depends, while the other two relationships are asymmetrical. The term cousin can 
thus be described as involving three relational components, one of which is a higher 
order component, ‘sibling of’, and it is a symmetrical relationship which requires 
understanding of both asymmetrical and symmetrical relationships. Apart from the 
extra relational component in the term cousin, it is very similar to the terms brother 
and sister. 


To summarise, the acquisition order of kin terms has been described in terms of 
the relational components involved in each kin term, both the nature of the relational 
components and their number being important. This analysis has given the following 
descriptions of kin terms in the order in which they are acquired by the child: 


(1) Asymmetrical relationship with one relational component. 

(2) Asymmetrical relationship with one relational component plus recursion. 

(3) Symmetrical relationship dependent on the understanding of asymmetric 
relationships, with two relational components. 

(4) Asymmetrical relationship dependent on the understanding of both sym- 
metric and asymmetric relationships with two relational components, one of 
which is a higher order component. 

(5) Symmetrical relationship dependent on the understanding of both symmetric 
and asymmetric relationships, with three relational components, one of 
which is a higher order component. 
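
The symmetric/asymmetric distinction in this list can be checked mechanically from the unabbreviated entries. The sketch below is offered as an illustration under stated assumptions, not as part of the original analysis: a relationship of X to Y is treated as symmetric when reading the chain of components backwards from Y to X, with ‘parent of’ and ‘child of’ exchanged, gives the same chain. Sex features are ignored, and the names INVERSE, CHAINS and is_symmetric are hypothetical.

```python
# Inverse of each basic relation: if P is parent of Q, then Q is child of P.
INVERSE = {"parent": "child", "child": "parent", "sibling": "sibling"}

# Relation chains from X to Y, taken from the Child-Centred entries in the text.
CHAINS = {
    "mother/father": ["parent"],
    "grandmother/grandfather": ["parent", "parent"],
    "sister/brother": ["child", "parent"],
    "aunt/uncle": ["child", "parent", "parent"],
    "cousin": ["child", "child", "parent", "parent"],
}


def is_symmetric(chain):
    """True when the chain read from Y back to X equals the chain from X to Y."""
    inverse_chain = [INVERSE[relation] for relation in reversed(chain)]
    return inverse_chain == chain


if __name__ == "__main__":
    for term, chain in CHAINS.items():
        kind = "symmetrical" if is_symmetric(chain) else "asymmetrical"
        print(f"{term}: {len(chain)} basic components, {kind}")
```

Under this check sister/brother and cousin come out as symmetrical and the remaining terms as asymmetrical, in agreement with the five descriptions above. The component counts printed are those of the unabbreviated entries, so they differ from the counts in the list wherever the ‘sibling of’ abbreviation or recursion is used.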


The kin terms described here can all be described as being Child-Centred relation- 
ships in terms of the distinction made earlier and these are the terms which the child 
first comprehends. Within this group we have seen that, in terms of the types of 
relationships and the number of relational components involved in each relationship, 
the terms are acquired in an ordered pattern. 


Other-Centred relationships 

When the Other-Centred relationships, that is kin terms which describe the 
relationship which the child himself holds to others, are examined, a similar regular 
acquisition order in terms of the nature and number of relational components involved 
in each kin term is apparent. 

Acquired first are the terms daughter and son, (X child of Y). This is an asym- 
metric relationship involving one relational component as shown in Figure 6. 

Next the terms granddaughter and grandson are acquired, (X child of A) (A child 
of Y). From Figure 7 it is apparent that this again is an asymmetrical relationship 
involving one relational component plus recursion. 


FIGURES 6-8 
OTHER-CENTRED RELATIONSHIPS 

Finally the terms niece and nephew are acquired, (X child of A) (A child of Z) (Z parent of Y), 
which, using the knowledge the child already has about the sibling relationship, 
become (X child of A) (A sibling of Y). This is an asymmetric relationship as shown 
in Figure 8, but it is dependent on the child's understanding the symmetric relationship 
‘sibling of’. Niece and nephew can be described as asymmetric relationships depending 
on the understanding of both asymmetric and symmetric relationships and involving 
two relational components, one of which is the higher order component ‘sibling of’. 


DISCUSSION 


An orderly sequence of development of knowledge of kin terms was apparent. 
This suggested that the acquisition of particular kin terms could perhaps be dependent 
on particular cognitive skills having been acquired by the child as suggested by Piaget 
and Inhelder (1969). 

Terms which described the relationship which others held to the child, Child- 
Centred terms, were acquired before terms which described the relationships which 
the child himself held for others, Other-Centred terms. Piaget (1956) described 
egocentricity in the young child as consisting of a “general incognizance of the 
notion of points of view and hence a lack of awareness of how the child’s own point 
of view may differ from other people's” (p. 196). The young child is, as Piaget put 
it, “the unwitting centre of his own universe”. Using this Piagetian conception of 
egocentricity, it would seem that the young egocentric child would find it impossible 
to define Other-Centred relationships as these require him to take the role perspective 
of the other in order to define the relationship which he holds for that person. This 
ability to define Other-Centred kin relationships would thus depend on the decline of 
egocentricity in the child as he develops. 

Terms involving asymmetrical relationships were acquired before those involving 
symmetrical relationships and terms with one relational component, as defined by 
Bierwisch (1970), were acquired before those with two or three components. The 
terms which were acquired later involved relationships which the child had acquired 
earlier, so that for example the term mother was comprehended before the term 
grandmother as understanding the latter depends on the former. Similarly the term 
sister appeared before the term aunt. Throughout the acquisition order observed, 
this pattern appears where children first acquire one relationship and then a more 
complex one whose comprehension depends on an understanding of the first relation- 
ship. 

Given this orderly pattern of development of understanding of kin terms, it 
seemed likely that the development of other cognitive operations might relate causally 
to the development of kin terms. Such cognitive skills as the ability to decentre, to 
handle series, to understand reciprocity, to reason logically and to handle abstract 
concepts all seem necessary skills in the comprehension of the complete set of kinship 
terms. The experimenter has, in later studies (Macaskill, in preparation), attempted to 
match different kin terms with Piagetian tasks involving the same underlying cognitive 
abilities. 


Thus the data presented here appear to lend considerable support to the Piagetian 
hypothesis that language acquisition is dependent on cognitive development. Kinship 
terms are acquired in an orderly pattern based on the conceptual or semantic com- 
plexity of particular terms, with the simplest terms being acquired first. It is a 
relatively long process reflecting the number and complexity of the cognitive skills 
which the child has first to acquire, and it is hoped that later studies will shed more 
light on the nature of this process. 

NOTE: Reprints available from the author at 52 Linden Avenue, Sheffield 8. 


APPENDIX I 
Sentence Completion Test. Each definition was typed on a separate card. 

The woman who had you as a baby is your ________ (Mother). 
The man who had you as a baby is your ________ (Father). 
A girl/boy with the same parents as you is your ________ (Sister/Brother). 
You are your mother's ________ (Son or Daughter). 
And a boy/girl would be her ________. 
You are your father's ________ (Son or Daughter). 
And a boy/girl would be his ________. 
The mother of your mother/father is your ________ (Grandmother). 
The father of your mother/father is your ________ (Grandfather). 

(Note: As subjects seemed to experience difficulty with the double possessive, the questions for 
granddaughter and grandson were preceded with the following question in its relevant form to 
ensure that the child was focusing on the appropriate person: “Has your mother got a mother, 
or has your father got a father?”) 

You are your mother's mother's ________ (Granddaughter or Grandson). 
And a girl/boy would be her ________. 
You are your father's father's ________. 
And a boy/girl would be his ________. 
Similarly for mother's father's and father's mother's. 

Your mother's/father's sister is your ________ (Aunt). 
Your mother's/father's brother is your ________ (Uncle). 
The children of your mother's/father's sisters are your ________ (Cousin). 
The children of your mother's/father's brothers are your ________ (Cousin). 
You are your mother's/father's sister's ________ (Niece/Nephew). 
And a boy/girl would be her/his ________. 
You are your mother's/father's brother's ________. 
And a boy/girl would be her/his ________. 


SEMANTIC CONTEXT AND GRAPHIC PROCESSING 
IN THE ACQUISITION OF READING 


By G. B. THOMPSON 
(Department of Education, Victoria University of Wellington, New Zealand) 


SUMMARY. Two experiments provided tests of predictions about children's use of seman- 
tic contextual information in reading, under conditions of minimal experience with 
graphic processing. The predictions were from the theory of Smith (1978) and from 
an extension of the La Berge and Samuels account of attentional limitations. In 
Experiment 1, 24 children of age 6½ years read, orally, passages of continuous text 
with normal and with low semantic constraints, under conditions (lower case, upper 
case, mixed case) in which the subjects had different degrees of experience with the 
graphic processing. In Experiment 2, 48 children of ages 8 and 11 years read the same 
passages under a different set of graphic conditions which included cursive handwriting. 
The results were not consistent with the predictions from either theoretical account. 
The results were discussed also in relation to the account of Morton (1964, 1969, 1979). 


INTRODUCTION 


IN the reading task the sources of information available for word identification are of 
two types, graphic and contextual. The first source is the visual input from the 
written or printed word of the text. The second source is the semantic (and syntactic) 
contextual information obtained by the reader from previously identified segments of 
the text. There is a general view, of some currency, that graphic information is of 
major importance for word identification by the inexperienced reader while contextual 
information is of little or no importance; however, for the experienced reader the 
relative importance is said to be reversed, contextual information becoming more 
important than graphic information (Singer, 1970, p. 150; Spragins et al., 1976; Shuy, 
1977). 

However, these views are lacking in specificity and give no account of reasons 
for the purported changes in the reader's processing of information, as the reader 
gains experience. The theory of Smith (1971, 1978) is an account which does give 
some specification of processing changes which are purported to occur as the reader 
gains experience. Limitations of short-term memory are claimed to be a critical 
constraint on the reader's use of semantic contextual information (Smith, 1978, p. 39). 
With acquisition of experience in reading the child will often be able to circumvent 
these limitations by use of ‘immediate meaning identification’, a process of identifying 
units of meaning rather than individual words. This process always results in relatively 
fast reading. But if the child is reading at a slow rate, in the vicinity of 1 second per 
word or slower, the child will be able to make very little use of semantic relationships 
between words. This is claimed to apply both to the generally inexperienced reader 
and to the experienced reader attempting difficult reading material (Smith, 1971, 
p. 208). It is of interest to note that the belief that the inexperienced reader is not 
able to use contextual information because of temporal limitations of memory has persisted 
for many years (Bloomfield, 1942, p. 183). 


Another theory which specifies processing changes as the reader acquires ex- 
perience was cited by Schvaneveldt et al. (1977) and is based on the La Berge and 
Samuels (1974) account of attentional limitations. According to the Schvaneveldt 
et al. extension of this account, the reader who is neither skilled nor well practised in 
processing the graphic information would be giving most available attention to this 
visual processing and little attention would be available for using semantic context 
to facilitate word identification. It is of interest that this account is similar to that 
given many years ago by Bryan and Harter (1899, p. 357) for auditory processing 
during the acquisition of telegraph receiving skills. 


The theory of Smith, and the Schvaneveldt et al. extension of the La Berge and 
Samuels account, thus provide predictions about the use of contextual information 
for word identification by the reader under conditions of minimal experience with 
the graphic processing. The two predictions are similar, although that from the 
Smith account is expressed in specific temporal terms. The predictions are: 


(1) From the Smith account, if the child is reading at a slow rate, in the vicinity 
of 1 second per word or slower, he will be able to make little, if any, use of 
semantic context for word identification. The child reading at a faster rate 
would be expected to make much more use of semantic context. 

(2) From the extension of the La Berge and Samuels account, if the child is not 
well practised in processing the graphic information, he will be able to make 
little, if any, use of semantic context for word identification. The child 
who is well practised in processing the graphic information would be expected 
to make much more use of semantic context. 


It is the purpose of this study to test these predictions as they apply to conditions 
representative of the reading of continuous text, and of a reader making his own 
temporal organisation of responses (not modified by the experimenter, as in brief 
tachistoscopic exposures). There is little existing evidence to test the predictions 
under these conditions. One study by Scheerer-Neumann (1979), although directed 
toward a somewhat different research question, does provide data which appear 
relevant. Scheerer-Neumann studied high and low progress readers in the third and 
fourth year of schooling and obtained reaction times to single words exposed subse- 
quent to the presentation of a sentence context. Contexts were either semantically 
congruent or incongruent with the target words. The mean difference in word 
identification response times between these two levels of contextual constraint was 
at least as great with the low progress as the high progress readers. As the response 
times were in the vicinity of 1 second the results are apparently inconsistent with the 
first prediction. On the assumption that the low progress group were not as well 
practised in graphic processing as the high progress group, the data would be
inconsistent with the second prediction. However, it is doubtful whether the demands
of either memory or attention would be as great for an identification response to a 
single target word in a sentence as in the case of many closely successive responses 
required in reading a continuous passage. 

In the present study an attempt is made to test the two predictions by using 
performance measures based on all word identification responses in passages of 
continuous text, and by manipulating conditions to represent different levels of 
practice in processing the graphic information, as well as manipulating semantic 
contextual constraints of the passages. In Experiment 1 children from first grade 
read, orally, passages of continuous text with normal semantic contextual constraints 
and passages with low semantic constraint, each under conditions representing 
different degrees of practice in graphic processing. 


EXPERIMENT 1 

Subjects 

The subjects were 24 children randomly selected by sex and age from pupils in 
the seven first-grade classrooms of two state schools in Melbourne, Australia. The 
mean age of the children was 6:10 years (range 6:7 to 7:1). There was an equal 
number of girls and boys. All were English-speaking Caucasian children. Occupa- 
tional groups most strongly represented by the families of the pupils of the two schools 
were craftsmen, process workers, and clerical workers. Professionals, managerial
workers and labourers were not so strongly represented. (Occupational classifications
according to Broom and Jones, 1976, pp. 121-124.) 


The Wechsler Intelligence Scale (WISC-R) Vocabulary test mean scale score for
the children was 8.8 with standard deviation 2.9 (US norms: M = 10.0, SD = 3.0).
The Wide Range Achievement Test (WRAT) revised edition, Reading Level 1 was
also administered and the mean standard score was 116.6 with standard deviation
12.6 (US norms: M = 100.0, SD = 15.0).


The experimental tasks and the standardised tests were administered within two 
months of the end of the school year, at which time the children had received at least 
12 months of school reading instruction. This instruction commenced in the schools 
during the first year, using in part a ‘language experience’ approach, but also direct
teaching of letter-sound correspondences. During the first grade, second year at 
school, teaching was mainly through concurrent use of a variety of school reading 
series, largely of British and Australian origin. A selection of at least four such
series was used during first grade in each of the classrooms.


Materials 

Six school reading book series were selected from the total of 12 series in use in 
the classrooms. Each of the semantically normal passages was an excerpt from one 
of the six series. The excerpt was selected to have vocabulary within the reading 
capability of the 6i-year-olds and to be amenable to the manipulations required to 
form passages of low semantic constraint. The following is one of the semantically 
normal passages: 


“Sit down Jack!
The boat will tip,” said Father.
Splash! Jack fell into the water.
“Here,” said Father, “Take this rod.”
Jack took the rod. Father pulled him
to the boat.
Jill and Father helped Jack
out of the water.


(In order to maximise uniformity in each child's reading response to proper names, 
each proper name of the original excerpt was substituted by one from a set of four 
very common names.) 


The following is the corresponding example of a passage of low semantic con- 
textual constraint: 


“Sit down water!
Father will tip,” said the rod.
Father! Jack fell into the rod.
“Here,” said the boat. “Take this water.”
Jack took Father. Jack pulled him
to the splash.
Jill and the boat helped Father
out of Jack.


These passages were constructed from the semantically normal passages by inter- 
changing some words within the passage. In the example the words interchanged are 
in italics. (There were no such italics in the passages presented to the subjects.) 
The interchange of words was carried out in such a way as to make the passages as 
semantically implausible as possible, within the criteria that (a) the resulting passage 
was syntactically acceptable, (b) the words interchanged were either nouns, pronouns,
or nouns in the company of an article or adjective, and (c) the constructed version
comprised the same words as the normal passage from which it was derived. To
provide a check on conformity to these criteria, preliminary versions were submitted 
for independent judgments by three faculty staff, and modifications made accordingly. 
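Criterion (c) lends itself to a mechanical check. The following is a minimal sketch and not the authors' procedure (their check relied on independent judges): it compares the multiset of words in the two example passages given above, ignoring case and punctuation. The names `word_counts` and `same_words`, and the tokenisation rule, are assumptions made for illustration.

```python
import re
from collections import Counter

# The two example passages quoted above, with quotation marks omitted.
NORMAL = """Sit down Jack! The boat will tip, said Father.
Splash! Jack fell into the water.
Here, said Father, Take this rod.
Jack took the rod. Father pulled him to the boat.
Jill and Father helped Jack out of the water."""

LOW = """Sit down water! Father will tip, said the rod.
Father! Jack fell into the rod.
Here, said the boat. Take this water.
Jack took Father. Jack pulled him to the splash.
Jill and the boat helped Father out of Jack."""


def word_counts(text):
    """Count each word, ignoring case and punctuation (assumed tokenisation)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def same_words(passage_a, passage_b):
    """True if both passages consist of exactly the same words, with the same frequencies."""
    return word_counts(passage_a) == word_counts(passage_b)


if __name__ == "__main__":
    print(same_words(NORMAL, LOW))
```

Under this tokenisation the two example passages contain the same words with the same frequencies; the check says nothing about criteria (a) and (b), which require human judgment.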


Three different graphic conditions were produced of each normal and each low 
semantic constraint version of the six excerpts, making a total of 36 (i.e., 3 × 2 × 6)
experimental passages. The three graphic conditions were: lower case, upper case, 
and mixed (upper/lower) case. The subjects would have had little or no previous 
practice in graphic processing under the second and third conditions but much more 
practice under the first condition. The lower case condition was as normally used in 
printing, with initial capitalisation only of words commencing a sentence and of 
proper names. In the upper case condition all letters were capitals. In the mixed 
case condition the initial letter of each word was in lower case and all other letters 
in upper case, e.g., jUMP. However, the initial letter of the first word of each
sentence, and of all proper names, was in upper case. Also, ‘I’ (as in ‘I aM’) was 
retained in upper case form. 
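The three graphic conditions can be illustrated with a short sketch. It assumes the input is already in normal print, so that capitals occur only at sentence starts, in proper names and in ‘I’; the function names and the treatment of apostrophes are illustrative assumptions, not details given by the authors.

```python
import re
from itertools import product

# A word is a run of letters, optionally with internal apostrophes (assumption).
WORD = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")


def lower_case(text):
    """Normal print: the text is returned unchanged."""
    return text


def upper_case(text):
    """All letters are capitals."""
    return text.upper()


def mixed_case(text):
    """Keep the case of each word's first letter and capitalise the remaining letters.

    Sentence-initial words, proper names and 'I' therefore stay in upper case,
    while every other word begins with a lower case letter.
    """
    def convert(match):
        word = match.group(0)
        return word[0] + word[1:].upper()
    return WORD.sub(convert, text)


GRAPHIC = {"lower": lower_case, "upper": upper_case, "mixed": mixed_case}
SEMANTIC = ("normal", "low")
EXCERPTS = range(1, 7)

# Every combination of graphic condition, semantic version and excerpt.
ALL_PASSAGES = list(product(GRAPHIC, SEMANTIC, EXCERPTS))

if __name__ == "__main__":
    print(mixed_case("Sit down Jack!"))
    print(len(ALL_PASSAGES))
```

Applied to ‘I am’, `mixed_case` gives ‘I aM’, in line with the example in the text, and the enumeration yields the 36 experimental passages.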


The passages were typed with a sanserif style of type face, size approximately 
20 points. Unit character spacing was used with six characters per inch. The passages
presented to the subjects were photocopy reproductions of these typescripts. 


Design 

The design was based on a 6 × 6 Latin square in which the six treatment
conditions (2 semantic × 3 graphic conditions) were combined with the six excerpts.
Each child was administered all six treatment conditions with one each of the six 
excerpts represented under each condition. There were four such Latin squares, one 
for boys and one for girls in each of the two schools. Schools were treated as a 
replication factor. (However, excerpts were not replicated, the same excerpts being 
presented in both schools.) Children were assigned at random to the rows of each 
Latin square, six boys and six girls in each of the two schools. 
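The structure of one such square can be sketched as follows. The particular squares and the randomisation used in the study are not reported, so the cyclic square below is purely an assumed example: rows stand for the six children of one school/sex group, columns for the six excerpts, and each cell names the treatment condition under which that child reads that excerpt.

```python
from itertools import product

SEMANTIC = ("normal", "low")
GRAPHIC = ("lower", "upper", "mixed")
TREATMENTS = list(product(SEMANTIC, GRAPHIC))  # the six treatment conditions


def cyclic_latin_square(n):
    """Return an n x n cyclic Latin square of treatment indices (illustrative choice)."""
    return [[(row + col) % n for col in range(n)] for row in range(n)]


def is_latin(square):
    """Check that every symbol occurs exactly once in each row and each column."""
    n = len(square)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return rows_ok and cols_ok


def assignment(n=6):
    """Map each (child, excerpt) cell to a (semantic, graphic) treatment condition."""
    return [[TREATMENTS[t] for t in row] for row in cyclic_latin_square(n)]


if __name__ == "__main__":
    for child, row in enumerate(assignment(), start=1):
        print(child, row)
```

Each child meets every treatment condition once and every excerpt once, and each excerpt appears once under every treatment condition within a square. The order in which the passages were presented was arranged separately, as described later in this section.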


The factors of treatment conditions (semantic constraint and graphic conditions) 
and sex were regarded as fixed factors, while excerpts, schools, and subjects were 
taken as random factors, in accord with the arguments of Coleman (1964) and Clark 
(1973). Because of the confounding of subjects and excerpts in the Latin squares, 
the presence of several random variables in the design did not cause complications in 
deriving appropriate F ratios from the expected values of the mean squares. The 
design provided for the variance due to the six treatment conditions to be divided 
into two orthogonal components: semantic constraint (2 levels) and graphic
conditions (3 levels). The pooled residuals of the Latin squares (80 df) were planned to
provide an estimate of experimental error.
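The figure of 80 df can be reproduced from the usual partition of degrees of freedom in a single t × t Latin square, pooled over the four squares. The sketch below assumes that textbook decomposition (rows, columns, treatments, residual); the function name is illustrative.

```python
def latin_square_df(t):
    """Degrees of freedom for one t x t Latin square (standard decomposition)."""
    total = t * t - 1
    rows = t - 1          # subjects
    columns = t - 1       # excerpts
    treatments = t - 1    # treatment conditions
    residual = total - rows - columns - treatments  # equals (t - 1) * (t - 2)
    return {"total": total, "rows": rows, "columns": columns,
            "treatments": treatments, "residual": residual}


N_SQUARES = 4  # boys and girls in each of two schools
POOLED_RESIDUAL_DF = N_SQUARES * latin_square_df(6)["residual"]

if __name__ == "__main__":
    print(latin_square_df(6))
    print(POOLED_RESIDUAL_DF)
```

Each 6 × 6 square contributes 35 − 15 = 20 residual df, so the four squares together give the 80 df used for the error term.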


The orders of presentation to each child of the six experimental passages were 
designed so that two passages in the same graphic condition were presented as a 
successive pair (one normal semantic, the other low semantic constraint). Just prior 
to each of these three pairs, a practice passage was presented in the same graphic 
condition as the experimental pair. The orders of presentation of the three experi- 
mental pairs and associated practice passages were balanced across the six subjects 
of each Latin square. For all subjects, this sequence of passages was preceded by 
two initial practice passages, one semantically normal, the other of low semantic
constraint, both in lower case print. Practice passages were constructed in a similar
manner to the experimental passages but were of shorter length.


Procedure 
The reading tasks were administered individually. To obtain a common task
set appropriate to the measurement of oral reading time, instructions were given to
read aloud each passage as fast as possible. Instructions were also given that some
passages “may sound silly”. Following a ready signal, each passage was exposed
at normal reading distance on a table top. The time interval from exposure of the 
passage to the child's (initial) response to the last word of the passage was measured 
by a stop-watch. The child was not questioned about the content of the passages. 


If during the reading the child failed to make any vocalisation for an interval of
7 seconds, the investigator pointed to the word which followed the last word read, 
and if the child still gave no response after a further 3 seconds, the investigator pointed 
to the next succeeding word. 


Results 

In summary, there was a positive influence of semantic contextual constraint on 
reading performance under the conditions in which the subjects had no previous 
practice in graphic processing (upper case, mixed case) as well as under less demanding 
conditions (lower case). Both reading time and errors of oral reading were used as 
measures of performance. The details of these results follow. 


Reading Times. Mean reading times in seconds per word are shown in Table 1. 
The times for the upper case and mixed case conditions were greater than 1 second per 
word, which is of critical significance for testing the prediction from the Smith 
account. The main effect of semantic constraint was significant, F (1, 80) = 10.90,
P<0.01. Passages with low semantic constraint were read more slowly than passages
of normal semantic constraint. The main effect of graphic conditions was also
significant, F (2, 80) = 22.30, P<0.01. The interaction between semantic constraint
and graphic conditions was not significant, F (2, 80) = 1.02, P>0.05.

Although the within subjects design did not enable a reliable test of the main 
effect of sex of subjects, tests of interactions with sex were available. None of the 
interactions with sex of subjects were significant (P>0.05), except the interaction
between graphic conditions and sex, F (2, 80) = 9.46, P<0.01, which indicated that
although the main effect for graphic conditions was significant, much of this effect 
was contributed by the girls, and not by the boys. 


The pooled interactions with schools were not significant, F (20, 80) = 1.55,
P>0.05. Had they been significant, the use of the pooled residuals of the Latin squares
as the error term would not have been justified.


Errors. Errors of reading were counted from audio-tape recordings. Words 
were taken as the unit for this count, and word substitutions, insertions and omissions 
were all counted as errors. If there was more than one response to a word, the last
response was the only one counted. 

TABLE 1

MEAN READING TIMES AND PERCENTAGE ERRORS FOR
FIRST-GRADE CHILDREN IN EACH CONDITION

                          Graphic condition (Case)
Semantic condition        Lower     Upper     Mixed

Mean Reading Time (sec per word)
Low constraint             0.98      1.38      1.45
Normal                     0.88      1.24      1.17
Mean Percentage Errors
Low constraint              6.2      11.7      11.0
Normal                      5.8       4.9       6.3


The mean percentages of errors are shown in Table 1. The interaction between
semantic constraint and graphic conditions was significant, F (2, 80) = 6.47, P<0.01.
Under the graphic conditions of upper case and mixed case, passages with low
semantic constraint were read with more errors than passages with normal semantic
constraint. Under the graphic condition of lower case, the simple effect of semantic
constraint was not significant, F (1, 80) = 0.07, P>0.05. The obtained interaction
could merely represent a ‘flooring effect’ if an error rate of 5 or 6 per cent were the
lower limit of error which the 63-year-old subjects can achieve under the task 
conditions. 


None of the interactions with sex of subjects were significant (P>0.05), except
the interaction between graphic conditions and sex, F (2, 80) = 3.29, P<0.05, which
indicated that much of the effect of graphic conditions was contributed by the girls, 
and not by the boys. Thus interactions with sex of subjects followed the same pattern 
as found for reading time. Reading performance of girls was negatively affected by 
upper case and mixed case, while that of boys was little affected. Such a sex difference
was unexpected and warrants further investigation in a study designed specifically to 
examine sex differences in graphic processing. 


The pooled interactions with schools were not significant, F (20, 80) = 0.95,
P>0.05.


EXPERIMENT 2 


Experiment 2 was conducted to make (i) a further test of the predictions, using 
a different set of conditions to represent levels of practice in processing graphic 
information, and (ii) a developmental comparison with older children having several 
years’ experience of reading. Unfamiliar cursive handwriting was thus used in 
Experiment 2 to provide a graphic condition on which older children had little or no 
previous practice, in contrast to familiar (school cursive) handwriting and lower case 
type on which the children were practised. 


A pilot investigation indicated that the youngest age at which children of average 
attainment could cope with reading unfamiliar cursive handwriting was 8 years, and 
then only if the vocabulary of the text was not beyond the first grade level. Experi- 
mental passages of this level were administered to two groups: third grade and sixth 
grade. 


Subjects 

There were 24 children in each grade group with equal numbers of girls and boys. 
All were English-speaking Caucasian children, and were randomly selected by age 
and sex from the two schools in Experiment 1. The mean age of the third-grade
group was 8:11 years (range 8:7 to 9:2). The WISC-R Vocabulary test mean
scale score for the group was 9.4 (SD = 2.1). The WRAT Reading Level 1 mean
standard score was 121.2 (SD = 17.3).


The mean age of the sixth-grade group was 11:10 years (range 11:6 to 12:2). 
The WISC-R Vocabulary mean scale score for the group was 9.0 (SD = 1.9). The
WRAT Reading Level 1 mean standard score was 119.0 (SD = 16.9).


Materials 

The verbal characteristics of the experimental passages were exactly the same as 
the passages used in Experiment 1. The difference was in the graphic conditions in 
which the material was presented: lower case type (as in Experiment 1), school 
cursive handwriting, unfamiliar cursive handwriting. 


The schools followed a uniform prescription for the teaching of handwriting.
Cursive handwriting was first introduced in third grade and the style taught was
prescribed in detail (Education Department, 1964), an unadorned style without
loops. Teaching in all classrooms followed this prescription. Experimental and
practice passages were produced in this style of school cursive handwriting, the size of
the writing being matched to the passages in lower case type. The interlinear spacing,
line length and line segmentation were similarly matched. The passages presented to
the child were photocopies of the handwritten scripts. A sample excerpt is given in
the first line of Figure 1.


FIGURE 1
SAMPLES OF SCHOOL CURSIVE (LINE 1) AND UNFAMILIAR CURSIVE HANDWRITING (LINE 2) USED IN
EXPERIMENT 2


Experimental and practice passages were also produced in a cursive handwriting 
style which would not be as familiar to the children as the school cursive. This 
‘unfamiliar cursive’ contained loops and some adornment. It was produced in a
size, line length, and spacing which matched the passages in lower case type. A 
sample excerpt is given in the second line of Figure 1. 


The design of the study was the same as in Experiment 1. The procedures of
administration were also the same. 


Results 

Reading time measures were obtained as in Experiment 1. It was intended that 
an analysis of variance be applied to the data, as in Experiment 1. Unfortunately it 
was found that the error variance for the unfamiliar cursive condition was many 
times greater than for the lower case type and school cursive conditions. This was 
so for both the third and sixth-grade groups. Transformation of the measure into 
another scale was not found to be a satisfactory remedy. The analysis of variance 
was therefore broken into separate components which could provide tests of treatment effects
without violating the assumption of homogeneity of error variances. The pooled 
Latin squares residual could not be used as an error term as in Experiment 1. The 
error term used in the component analyses was the interaction of Treatments ×
Subjects (within school/sex groups), where the Subjects factor was completely con- 
founded with Passages (Clark, 1973, p. 348; Coleman and Miller, 1974). The 
analysis of each grade group was treated separately, as error variance was not homo- 
geneous across grade groups. 
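The degrees of freedom of this error term can be worked out in a short sketch. The reading below is an inference from the design and is not stated by the authors: with six subjects in each of the four school/sex groups, subjects within groups carry 4 × 5 = 20 df, and the error df for a component analysis is that figure multiplied by the treatment df of the component. The function names are illustrative.

```python
GROUPS = 4               # two schools x two sexes
SUBJECTS_PER_GROUP = 6


def subjects_within_groups_df(groups=GROUPS, per_group=SUBJECTS_PER_GROUP):
    """Degrees of freedom for subjects nested within school/sex groups."""
    return groups * (per_group - 1)


def error_df(treatment_df):
    """Degrees of freedom for Treatments x Subjects (within groups)."""
    return treatment_df * subjects_within_groups_df()


# Unfamiliar cursive alone: one contrast (normal versus low semantic constraint).
UNFAMILIAR_CURSIVE_ERROR_DF = error_df(1)

# Lower case type and school cursive together: 2 semantic x 2 graphic
# conditions give four treatment conditions, hence 3 treatment df.
FAMILIAR_CONDITIONS_ERROR_DF = error_df(2 * 2 - 1)

if __name__ == "__main__":
    print(UNFAMILIAR_CURSIVE_ERROR_DF, FAMILIAR_CONDITIONS_ERROR_DF)
```

On this reading the two component analyses have 20 and 60 error df respectively, which agrees with the F ratios reported for Experiment 2.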


Third Grade. When reading time and errors are considered together, the results 
show a positive effect of semantic constraint on reading performance, under the 
unfamiliar cursive condition for which the subjects had no previous practice in graphic 
processing. There was a similar effect under the less demanding conditions (school 
cursive, lower case). Mean reading times in seconds per word and mean percentage 
errors are shown in Table 2. The times for the unfamiliar cursive conditions were 
greater than 1 second per word, which is of critical significance for testing the
prediction from the Smith account.


TABLE 2

MEAN READING TIMES AND PERCENTAGE ERRORS FOR
THIRD-GRADE CHILDREN IN EACH CONDITION

                                Graphic condition
                          Lower     School    Unfamiliar
Semantic condition        case      cursive   cursive

Mean Reading Time (sec per word)
Low constraint             0.53      0.56      1.12
Normal                     0.44      0.47      1.08

Mean Percentage Errors
Low constraint              2.4       2.3      12.5
Normal                      1.8       1.2       8.0


By the component analysis for unfamiliar cursive, the effect of semantic constraint 
on errors was significant, F (1, 20) = 12.91, P<0.01, although the effect on reading
time was not, F (1, 20) = 0.43, P>0.05. It should be noted, however, that the
obtained reading times for unfamiliar cursive at both levels of semantic constraint 
were extremely high for children at third grade (being at least as high as in the most 
difficult condition for first grade in Experiment 1). It is apparent that there must be 
some upper limit to the time a child will spend attempting to identify any word. 
When this limit has been reached, the word is omitted or some ‘best attempt’ is
made, so that identification of subsequent words may proceed. These omissions, and
many of the ‘best attempts’, will count as errors. Now if this upper limit has been
reached for much of the text in each of the levels of semantic constraint, no differences 
in reading time between the levels can be expected, but any performance differences 
would be revealed in error rates. Such could be the case for the present results of 
third-grade children for the unfamiliar cursive condition. 

The lower case type and school cursive conditions were analysed in a further 
component analysis. The main effect of semantic constraint on reading time was 
significant, F (1, 60) = 33.42, P<0.01, passages with low semantic constraint being
read more slowly than passages of normal semantic constraint. The mean percentages 
of errors were too close to the lower bound to enable a satisfactory analysis of 
differences (see Table 2). 


Sixth Grade. There was a positive influence of semantic constraint on reading
performance under each of the three graphic conditions. The results are shown in
Table 3. By the component analyses for unfamiliar cursive, the effect of semantic
constraint on reading time was significant, F (1, 20) = 8.30, P<0.01, and also the
effect on errors, F (1, 20) = 8.02, P<0.05.


TABLE 3

MEAN READING TIMES AND PERCENTAGE ERRORS FOR
SIXTH-GRADE CHILDREN IN EACH CONDITION

                                Graphic condition
                          Lower     School    Unfamiliar
Semantic condition        case      cursive   cursive

Mean Reading Time (sec per word)
Low constraint             0.41      0.43      0.65
Normal                     0.34      0.37      0.52

Mean Percentage Errors
Low constraint              2.4       2.9       6.3
Normal                      2.3       2.6       3.5


In the component analysis for lower case type and school cursive conditions, the 
main effect of semantic constraint on reading time was significant, F (1, 60) = 57.68,
P<0.01. The mean percentages of errors were too close to the lower bound to
enable a satisfactory analysis of differences (see Table 3). 


DISCUSSION 


In Experiments 1 and 2, for the conditions in which reading was slower than 1 
second per word, the children made considerable use of semantic context for word 
identification, relative to conditions in which reading was faster. This finding is not 
consistent with the theoretical account of Smith, which predicts that children reading 
at such slow rates will be able to make little, if any, use of semantic context for word 
identification. 


In examining the prediction from the extension of the La Berge and Samuels 
account, the results for first grade (Experiment 1), as well as third grade (Experiment 
2), showed a positive effect of semantic contextual constraint on word identification 
performance, under conditions of graphic processing in which the subjects had no 
previous practice. The effect on word identification performance was at least as 
great under these conditions as it was when graphic processing was well practised, as 
in the case of the sixth-grade children reading lower case print. The results are not 
consistent with the theoretical account based on attentional limitations. Although 
no attempt was made to measure the attention which children gave to the sources of 
information, the study has tested a prediction from the theory about the effects of 
levels of practice. The reader who is not well practised in processing the graphic 
information will give most available attention to that processing. Little attention 
would be available for using semantic context and thus it is predicted that the reader 
would be unable to make much use of that source of information. The prediction 
was not confirmed. 


The conclusion is that any restrictions on the reader's use of semantic contextual 
information which arise from limitations of capacity of either memory or of attention 
do not appear to be such critical limitations as has been supposed in the account of 
Smith, or in the Schvaneveldt et al. extension of the La Berge and Samuels account.
The findings do not of course preclude the existence of attentional or memory limi- 
tations on reading performance, but such limitations are not as critical as the specific 
properties of the two theoretical accounts would imply, at least under conditions in 
which the reader makes his own temporal organisation of responses. The present 
study does not address the question of whether the theories give adequate accounts 
under other conditions, such as experimenter paced reading. 


Consideration needs to be given to alternative accounts of reading which do not 
specify such critical limitations on the use of contextual information. One such 
account is that of Morton (1964, 1969, 1979). In this account it is assumed that all 
available contextual information is used by the reader, but only as much graphic 
information is used as needs to be added to the contextual information to determine 
a word identification response. It is also assumed that although graphic information 
is transitory, the contextual information is sustained as the reader proceeds through 
the text (Morton, 1969, p. 166). Thus any memory limitations on the use of 
contextual information would not be critical. Moreover, in the Morton account the 
means of using contextual information is common to both aural reception of speech 
and to reading. Therefore the child who has experience in reception of speech will 
be able to use contextual information effectively when attempting to process graphic
information in reading, even with little experience of reading or the particular graphic
conditions. The general notions of the Morton account appear to be consistent with 
the present findings, although no attempt has been made to test detailed quantitative 
predictions. While the Morton account is far from being a fully developed theory of 
reading, at least some features of it warrant attention in the task of theory construction.


A COMPARISON OF TEACHERS’, PUPILS’ AND
PARENTS' ATTRIBUTIONS REGARDING PUPILS' 
ACADEMIC ACHIEVEMENTS 


By D. BAR-TAL AND J. GUTTMANN
(Tel-Aviv University, Israel) 


SUMMARY. This study compared attributions of teachers, pupils, and parents regarding 
pupils’ academic success or failure. Eighty pupils were asked to indicate the extent to 
which each of the 10 given causes influenced their achieved grade. In addition, 8 teachers 
and 50 parents were given the same questionnaire and were asked to explain the achieve- 
ments of the pupils. The results showed that teachers attributed pupils’ success mainly 
to themselves and to the pupils, pupils attributed their success mainly to themselves 
and to the teacher, and parents attributed their children's success mainly to themselves 
and to the teacher. In case of failure, teachers and pupils shared the responsibility for 
the outcome with each other and with the parents, while the parents shared the responsi- 
bility mainly with the pupils or cited external causes. 


INTRODUCTION 


RECENTLY, a number of studies have investigated teachers' attributions of causality
for the success and failure of their pupils (e.g., Beckman, 1970, 1973; Ames, 1975). 
The importance of these studies is based on their showing evidence that the causes 
which teachers cite to explain pupils’ achievement outcomes may have an effect on 
their expectations concerning the pupils’ future achievements (Bar-Tal, 1979). As 
a result, these teachers’ expectations may, in fact, influence pupils’ academic per- 
formance (Rosenthal and Jacobson, 1968). 


Teachers may attribute pupils’ achievements to themselves (e.g., own teaching 
ability, own motivation), to pupils (pupils’ ability, invested effort), or to external
causes (difficulty of test, luck). These causes can be classified on a stability dimension 
(see Weiner, 1974), whereby some causes are stable over time (e.g., ability, task 
difficulty) and others are unstable and may change in the future (effort, luck). Thus, 
the teachers' attribution of an outcome to stable causes results in an expectancy that 
the same outcome will be repeated, since stable causes do not change over time. An 
attribution of an outcome to unstable causes results in an increased expectancy of a 
different outcome, since unstable causes can be modified (Weiner, 1974). These
different expectations can then be transmitted to the pupils, and often the pupils will, 
in turn, conform to these expectations. 


The results of studies investigating the teachers' ascription of causality regarding 
pupils’ academic outcomes have been somewhat contradictory. While Johnson et al.
(1964), Beckman (1970), and Brandt et al. (1975) found that the pupils’ performance
may lead the teachers to ego-enhancing and ego-defensive causal perception of pupils’ 
successes and failures, Beckman (1973), Ross et al. (1974), and Ames (1975) did not 
find such tendencies in teachers’ causal perceptions. The former studies found that 
teachers tend to take credit for pupils’ successes and tend to attribute failures to causes 
external to them. The latter studies showed that teachers tend to take responsibility 
for their pupils’ failures but give credit to pupils if they succeed. Generally, the 
differences in the results of these studies can be attributed to the various experimental 
conditions which each study used. For example, in the studies of Johnson et al. 
(1964), Beckman (1970), and Brandt et al. (1975), the pupils were hypothetical cases; 
in the studies of Beckman (1973) and Ross et al. (1974), the subjects could see the 
pupils, but could not interact with them; and in Ames’ (1975) study, the subjects 
actually interacted with the confederate (the pupil) for fifteen minutes. All the
studies were carried out in a laboratory setting in which the subjects (sometimes
college students) were instructed to teach one or two children for a short period of 
time. 

In view of these limitations regarding the ecological validity of these studies, the
present study was designed to extend the scope of investigation of causal perception 
of teachers to real situations in school settings. In addition, the present study adds 
a new direction to the investigation by comparing the attributions of teachers, their 
pupils, and their pupils’ parents concerning the pupils’ academic outcomes. 

A number of studies which explored individuals’ causal perceptions of their 
successes and failures found that, in general, individuals tend to attribute their successes 
to themselves and their failures to external causes (e.g., Simon and Feather, 1973; 
Nicholls, 1975). Clearly, then, the pupils’ attribution pattern may well conflict with 
the teachers’ ascriptions of causality regarding pupils’ success or failure. Since 
teachers are active actors in the teaching-learning process, their pupils’ academic 
achievements are often regarded as reflecting the teachers’ own successes or failures. 
Therefore, teachers might well be expected to attribute a pupil’s success to themselves 
and a pupil’s failure to external causes. Also, we would assume that the attribution 
pattern of pupils’ parents will be somewhat similar to that of their children, since 
parents tend to believe that they exert a great influence on their children’s achieve- 
ments through genetic transmission and through socialisation processes. Thus, the 
attributions of academic achievements to pupils’ internal causes (e.g., ability, 
personality) are believed to reflect parents’ own internal characteristics. Parents can, 
therefore, give credit to their children for their achievements and still experience 
ego-enhancement. 


Specifically, it was hypothesised that (a) whereas teachers would tend to attribute 
their pupils’ success mainly to their own teaching ability, pupils would tend to 
attribute their success mainly to their own abilities, and parents would attribute 
their children’s success mainly to internal characteristics of their children and to 
their own influence, and (b) whereas teachers would tend to attribute their pupils’ failures 
mainly to characteristics of the pupils, pupils and parents would tend to attribute the 
same failures mainly to reasons other than themselves. The present study also com- 
pares teachers’, pupils’, and parents’ evaluations of the achievement outcome, their 
feelings of satisfaction concerning the outcome, and their expectations regarding 
future academic achievement. 


METHOD 

Subjects 

Eight female 4th and 5th grade mathematics teachers, 69 of their pupils (35 males 
and 34 females), and 69 parents (59 fathers and 10 mothers, one of each pupil’s 
parents) participated in the study. Originally, 80 pupils were randomly selected for 
the study (10 pupils, 5 males and 5 females, from each of eight different classes in 
the same school). For consistency of parental response, the experimenter attempted 
to contact only the fathers of the pupils. However, because of a lack of availability 
of the fathers in 10 cases (due to divorce, military reserve service, or work schedule), 
10 mothers were asked to fill out the questionnaire. In 11 other cases, neither parent 
completed the questionnaire, and in our analysis those 11 pupils were dropped from 
the sample, leaving 69. The school itself serves a middle class neighbourhood, and 
the pupils selected were homogeneous with respect to demographic characteristics. 


Procedure 

After the children were informed of their autumn trimester grade in mathematics, 
they were asked to fill out a questionnaire in the classroom. The teacher left the 
classroom and the experimenter distributed printed instructions and the questionnaire. 
The experiment was presented as a study of pupils’ thoughts and feelings regarding 
their grades. The pupils were told that there were no right or wrong answers and that 
their answers would not be made known to the school authorities or to their parents. 


Teachers and parents were contacted in their homes during evening hours and 
asked to fill out the questionnaire after receiving the same printed instructions from 
the experimenter. The instructions provided by the experimenter were similar to the 
ones given to the pupils. The questionnaire given to pupils, teachers, and parents 
contained the same questions. 


Questionnaire 

The questionnaire was similar to the one used by Bar-Tal and Darom (1979) and 
was structured on the basis of a pilot study in which 20 pupils were asked on an 
open-ended questionnaire to list causes which could have contributed to the grade 
received. The pupils for the pilot study were drawn from the same population as the 
subjects. All causes which were mentioned by at least two pupils were used in the 
list of the present study. 

The questionnaire consisted of three major parts. First, the subjects were asked 
to evaluate the mathematics trimester grade of the pupil as a success or failure. 
Second, the subjects were asked to indicate the extent to which each of 10 listed causes 
influenced the achieved grade. The answers were given on a five-point scale ranging 
from (5) very great influence to (1) very little influence. The 10 causes listed were: 
ability in mathematics, interest in mathematics, difficulty of the material in mathe- 
matics, effort exerted to study mathematics during that trimester, quality of the 
teacher's explanation, conditions of study at home, help given by parents, luck, 
diligence, and the difficulty of the exams on which the grade was largely based. 
Finally, the subjects were asked to indicate on a four-point scale the extent to which 
they felt satisfied with the grade and to indicate on a five-point scale their expectation 
regarding the mathematics grade in the next trimester. 


Methods of analysis 

Two sets of statistical analyses were performed with the data regarding causal 
attributions. One compared the causal perception of success and failure among the 
three groups (teachers, pupils and parents), and the other compared the causal 
perceptions in case of success and failure within each of the groups. The statistical 
analyses (ANOVAS) comparing the causal perceptions of the three groups were 
performed only with those cases in which an agreement was found among the teachers, 
pupils, and parents as to their evaluations of the grade as success or failure. This 
restriction was used because of the repeated measure nature of the data (i.e., a teacher, 
a pupil, and a parent evaluated the same grade). The statistical analysis which com- 
pared the attributions in case of success with attributions in case of failure, within 
each of the groups, was carried out with all the cases, since the agreement among the 
three groups is irrelevant for this type of analysis. 
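
The case selection described above can be sketched as follows. This is a minimal illustration rather than the authors' own procedure: the record layout (`Evaluation` and its field names) and the sample records are hypothetical, and only the selection rule itself (a case enters the between-group comparison when all three evaluators agree) is taken from the text.

```python
from typing import List, NamedTuple


class Evaluation(NamedTuple):
    """One pupil's grade as judged by the three evaluators (hypothetical layout)."""
    pupil_id: int
    teacher: str  # "success" or "failure"
    pupil: str
    parent: str


def agreement_cases(cases: List[Evaluation]) -> List[Evaluation]:
    """Keep only cases where teacher, pupil and parent gave the same evaluation."""
    return [c for c in cases if c.teacher == c.pupil == c.parent]


# Hypothetical records, for illustration only.
sample = [
    Evaluation(1, "success", "success", "success"),
    Evaluation(2, "success", "failure", "success"),
    Evaluation(3, "failure", "failure", "failure"),
]

between_group_cases = agreement_cases(sample)  # used to compare the three groups
within_group_cases = sample                    # all cases, used within each group
```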


RESULTS 


Several statistical analyses were performed in order to compare pupils’, teachers’, 
and parents’ evaluations of the causal attributions of academic performance, feelings 
of satisfaction, and expectations for future performance. The analyses indicated that 
there was no difference in answers between male pupils and female pupils nor between 
mothers and fathers on the dependent measures. Therefore, the data were combined 
with regard to these two independent variables. 


Evaluation of performance 

TABLE 1 

BIVARIATE DISTRIBUTIONS OF PUPIL-TEACHER, PUPIL-PARENT AND 
TEACHER-PARENT PERCEPTIONS OF SUCCESS AND FAILURE 

In order to compare teachers’, parents’, and pupils’ evaluation of the grades as 
success or failure, three bivariate distributions (parents-teachers, parents-pupils, 
pupils-teachers) were tabulated and are presented in Table 1. McNemar tests for 
differences between correlated proportions of perceived success revealed significant 
differences between pupils and teachers, χ² (1) = 6.14, P<0.05; between pupils and 
parents, χ² (1) = 18.10, P<0.001; and between teachers and parents, χ² (1) = 9.80, 
P<0.01. 
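
For readers unfamiliar with the McNemar test for correlated proportions, the sketch below shows the usual textbook form of the statistic, which depends only on the two discordant cell counts of a 2 x 2 table. The counts passed to `mcnemar_chi2` are hypothetical (the cell frequencies of Table 1 are not reproduced here), and the text does not say whether a continuity correction was applied. The helper `chi2_sf_1df` converts a chi-square value with one degree of freedom into a P value and is applied to the three statistics reported above.

```python
import math


def mcnemar_chi2(b: int, c: int, continuity: bool = False) -> float:
    """McNemar chi-square (1 df) from the two discordant cell counts b and c."""
    if b + c == 0:
        raise ValueError("no discordant pairs")
    diff = abs(b - c)
    if continuity:
        diff = max(diff - 1, 0)  # optional continuity correction
    return diff ** 2 / (b + c)


def chi2_sf_1df(x: float) -> float:
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# Hypothetical discordant counts, for illustration only.
example_stat = mcnemar_chi2(15, 5)

# P values implied by the three chi-square values reported in the text.
reported = {"pupils-teachers": 6.14, "pupils-parents": 18.10, "teachers-parents": 9.80}
p_values = {pair: chi2_sf_1df(stat) for pair, stat in reported.items()}
```

Under these assumptions the three P values fall below 0.05, 0.001 and 0.01 respectively, in line with the thresholds quoted in the text.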


Causal attributions 

Comparisons within the groups. First, Figure 1 illustrates the detailed attributional 
patterns of teachers, pupils, and parents for perceived success and for perceived failure. 
Teachers tended to attribute pupils’ success mainly to pupils’ diligence, effort, interest, 
and their own quality of explanations; pupils tended to attribute their own success 
mainly to their own efforts, their teacher’s explanations, and their own diligence and 
ability; and parents tended to attribute their children’s success mainly to home 
conditions and teacher’s explanations. Failure was attributed by teachers mainly to 
pupils’ low efforts, difficulty of the material, and home conditions inappropriate for 
studying; by pupils mainly to lack of parents’ help and difficulty of tests; and by 
parents mainly to inappropriate home conditions and child’s low level of interest and 
ability. 


In order to test the study’s main hypotheses, the 10 causal attributions were 
combined into four groupings: pupil-related causes, external causes, teacher-related 
causes, and parent-related causes. Table 2 presents the means and statistical differences 
between them within each group of subjects. Multiple t-tests revealed that in the case 
of failure, teachers, pupils, and parents evidenced a similar attributional pattern in 
that they all attributed failure in some degree to themselves as well as to the other 
three groups of causes. For example, pupils attributed failure to themselves as much 
as to their teachers, their parents, and external causes. They did, however, find their 
parents to have greater influence on their failure than did external causes. The only 
exception to this pattern was found among the parents who considered themselves as 
having significantly more to do with their children’s failure than the children’s 
teacher. In the case of success, teachers and pupils attributed success to themselves 
to the same degree as to each other, and parents and external causes were considered 
to be less influential. Parents also tended to attribute success to themselves; however, 
they shared credit not with their children, but with the teacher. 


Comparison among the groups. An analysis of variance with factorial design 
(3 x 2) was performed to compare the causal perceptions of success and failure among 
the three groups. The within-subjects factor was Group of Evaluators (i.e., teachers, 
pupils, parents) and the between-subjects factor was Outcome (i.e., success, failure). 
This analysis was carried out with 42 cases in which there was agreement among a 
teacher, pupil, and parent in their evaluations of the grade either as success or failure. 
(In 32 of the cases there was a three-way agreement regarding the evaluation of the 
grade as a success, and in 10 there was agreement regarding the evaluation of the 
grade as a failure.) Table 3 presents the means for each attributional cause as 
perceived by teachers, pupils, and parents. 


The results of the analysis indicated that for six of the causal attributions the 
means across evaluators’ groups were significantly higher in the case of success than 
in the case of failure: ability, F (1, 60) = 23.85, P<0.01; interest, F (1, 60) = 17.63, 
P<0.01; effort, F (1, 60) = 3.96, P<0.05; teacher’s explanations, F (1, 60) = 14.87, 
P<0.01; diligence, F (1, 60) = 17.06, P<0.01; and luck, F (1, 60) = 3.72, P<0.05. 
Significant differences in the importance attached to the various causes by teachers, 
pupils and parents were found with regard to material difficulty, F (2, 60) = 4.09, 
P<0.05; luck, F (2, 60) = 14.03, P<0.01; diligence, F (2, 60) = 4.36, P<0.05; 
teacher’s explanation, F (2, 60) = 14.87, P<0.01; and effort, F (2, 60) = 3.52, 
P<0.05. Tukey’s HSD test with 0.05 criterion of significance was used to investigate 
which of the pair-wise post-hoc comparisons is significant. It was found that with 
regard to material difficulty, teachers attached more influence to this cause than did 
either pupils or parents, while no difference between the latter two was detected. 
Pupils and parents, on the other hand, felt that luck played a more influential role 
than did teachers. Again no difference was found between pupils and parents. 
Diligence was thought to be more influential by teachers than by either the pupils or 
the parents. The pupils attributed greater influence to teacher’s explanation than did 
either the teachers themselves or the parents—the latter two not differing in their 
perception with regard to this cause. With regard to the effort exerted in studying, 
both teachers and pupils felt that it was more influential than did the parents. 
Significant interactions were also manifested with respect to teacher’s explanation 
and parents’ help. The first interaction indicates that while the teachers and the 
parents cited teacher’s explanations more for success than for failure, pupils attributed 
a similarly high influence for this cause in both cases of success and failure, F (2, 60) = 
10.12, P<0.01. The second interaction indicates that while pupils perceived parents’ 
help as more influential in the case of failure than of success, teachers perceived this 
cause as more influential in the case of success than of failure, F (2, 60) = 7.03, 
P<0.01 (although in general, teachers evaluated the influence of this cause as low). 


Feelings of satisfaction and expectation 

In order to compare teachers’, pupils’, and parents’ feelings of satisfaction with 
the grade received and their levels of expectation for future outcome, a one way 
analysis of variance for each of these variables was carried out. The first analysis 
indicated that there was no difference among the means of satisfaction of the three 
groups. The second analysis with respect to expectations yielded a significant result, 
F (2, 67) = 39.12, P<0.001. Duncan’s post-hoc comparison with 0.05 criterion of 
significance showed that teachers’ expectation for future performance was significantly 
lower (M = 2.91) than either pupils’ (M = 3.68) or parents’ (M = 3.79) expectation, 
while there was no difference found between the latter two groups. 
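
As a reminder of how an F ratio of this kind is formed, the sketch below computes a one-way analysis of variance in its simplest textbook form, for independent groups. It is an illustration under stated assumptions only: the ratings are hypothetical, the function name `one_way_anova` is introduced here, and the text does not specify the exact model behind the reported F (2, 67), which may have taken account of the fact that the three evaluators judged the same pupils.

```python
from typing import List, Sequence, Tuple


def one_way_anova(groups: Sequence[Sequence[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for independent groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means: List[float] = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual ratings around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical expectation ratings for three groups, for illustration only.
teachers = [1.0, 2.0, 3.0]
pupils = [2.0, 3.0, 4.0]
parents = [5.0, 6.0, 7.0]
f_ratio, df_b, df_w = one_way_anova([teachers, pupils, parents])
```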

DISCUSSION 


The results of the present study indicate that causal explanations by teachers, 
pupils, and parents regarding pupils’ academic success and failure differ especially 
in situations of success. In the case of success, teachers attributed the outcome 
mainly to themselves and to the pupil, while failure was attributed to a variety of 
causes. Pupils attributed success mainly to themselves and the teacher and failure 
to a variety of causes as well. Parents attributed success mainly to the teacher and 
themselves and failure to different groups of causes. Looking at the means of the 
attributions in Table 2 and at Figure 1, the specific patterns of attributions are as 
follows: (a) teachers tended to attribute pupils’ success mainly to pupils’ diligence, 
effort, interest, and to their own quality of explanations, while they attributed pupils’ 
failure mainly to pupils’ low effort, difficulty of material, and home conditions; 
(b) pupils tended to attribute their success mainly to their own efforts, teachers’ 
explanations, and their own diligence and ability, while they attributed failure mainly 
to a lack of parent’s help and difficulty of tests; (c) parents tended to attribute 
their children’s success mainly to home conditions and teacher’s explanations, while 
they attributed failure mainly to home conditions and child’s lack of interest and 
ability. 


These results only partially confirm the predictions. The teacher appears to be 
the most influential cause of success; the teachers themselves shared the credit for 
success with pupils, while pupils and parents each attributed the success primarily 
to the teacher and then to themselves. Thus, each group perceived the teacher as 
having the major responsibility in the successful outcome of the pupil. However, the 
attributional patterns of all three groups can be seen as somewhat ego-enhancing 
because each group attributed the success also to itself. The causal perception of 
parents in case of success is somewhat surprising. They do not share the credit for 
success with their own children, but with the teacher. It is possible that parents 
perceive the teacher as a very powerful agent who can improve the children’s achieve- 
ments. 


In the case of failure, each group distributed the responsibility among a variety of 
causes, but the external causes were considered as the least influential, especially for 
teachers and pupils. Thus, teachers, pupils, and parents mostly divided the responsi- 
bility for pupils' failure among themselves. Also, it should be pointed out that the 
patterns of pupils' attributions appear to be more similar to teachers' attributions than 
to parents’ attributions. This latter result is in line with the Каму (1977) findings 
which indicate that teachers more than parents affect pupils’ perceptions of their role 
in the classroom. Indeed, Bar-Tal (1979) reviewed evidence which suggests that 
teachers influence pupils’ attributions through verbal communication. 


It should be pointed out that an alternative explanation of these results is possible 
—that is, that the attributional patterns of teachers, pupils, and parents are based on 
information-processing analysis and not on motivational, biased tendencies (see 
Miller and Ross, 1975). In accordance with this interpretation, teachers’ and pupils’ 
similar attributions might merely reflect the reality of the testing situation in the 
classroom, while parents' differing attributions would be based on the information 
that they receive. On the basis of the present data, it is impossible to determine which 
explanation is more satisfactory. Future research should direct more attention to 
exploring this point. 

Comparisons among teachers', pupils', and parents' attributions performed only 
on the cases of agreement regarding the outcome showed that: (a) material difficulty 
and diligence of the pupil were considered to be more important causes of success and 
failure by teachers than by pupils and parents; (b) effort exerted by the pupil was 
considered to be a more influential cause of success or failure by teachers and pupils 
than by parents; (c) luck was considered to be a more influential cause of success or 
failure by pupils and parents than by teachers; (d) teacher’s explanation was considered 
to be a more influential cause of success by teachers and parents than by pupils; (e) 
pupils considered teacher's explanation as a similarly important cause of success and 
of failure; and (f) pupils considered parents' help (lack of it) as more influential in 
failure than success. These results, which can be treated only comparatively among 
the three groups, are in line with the previously described findings. Although teachers 
recognised the invested effort and the diligence of the pupils as influencing the outcome, 
they generally tended to consider their own teaching ability as especially influential in 
the case of pupils’ success. Pupils tended to blame the parents for their failure and 
generally considered their invested effort and luck as influential. Parents agreed, as 
in the previous analysis, that the teacher plays an important role in pupils’ success 
and agreed with the pupils that luck also is, in general, an influential cause. 


Other findings of the present study indicated that pupils tended to evaluate their 
grades as less successful than teachers and parents, while parents tended to evaluate 
pupils’ grades as more successful than teachers. However, pupils and parents had 
higher expectations regarding future success than teachers. These two findings seem 
to be related. As the results of other studies indicate (e.g., Simmons and Rosenberg, 
1971), pupils often tend to have unrealistically high expectations regarding future success 
and, therefore, the achieved outcome may be below the expected grade. As a conse- 
quence of such a tendency, they may evaluate their outcome as being less successful than 
do their teachers or parents. 


In sum, the results of the present study did not replicate completely the findings 
of laboratory experiments which investigated teachers’ attributions. The results 
showed neither extreme ego-enhancing nor ego-defensive attributions of teachers and 
showed neither extreme self-enhancing nor self-blame attributions on the part of the 
pupils. It is possible that in real-life situations teachers exhibit a combination of both 
patterns. Recognising the importance of pupils’ causal perceptions of success and 
failure for an understanding of pupils’ achievement-related behaviour (Weiner, 1974) 
and the effect of teachers’ attributions on pupils’ causal perception (Bar-Tal, 1979), 
further research is needed to explore how the attributions of teachers, pupils, and 
parents are formed and how they are related to achievement. 


LEVEL I AND LEVEL II ABILITIES IN PRIMARY 
SCHOOL CHILDREN 


By A. J. MACKENZIE 
(Counselling Service, The University of Western Australia) 


SUMMARY. Jensen’s (1970) two-level theory of ability relates intelligence and rote 
learning to socio-economic status (SES) and educational achievement. In order to 
test four hypotheses from the theory, two tests of Level I ability (Digit Span, and 
Paired Associates) and two tests of Level II ability (Peabody Picture Vocabulary Test, 
and Raven's Matrices) and a questionnaire were administered to 525 fifth grade, 
Australian primary school children. Only one hypothesis was supported: that Level II 
ability is more strongly associated with SES than Level I ability. The utility of Jensen's 
suggestion that Level I ability should be more fully exploited in the education of disad- 
vantaged children is questioned. 


INTRODUCTION 


BECAUSE of widespread interest in reducing the educational gap between middle and 
lower SES children, the relationship between intelligence and learning ability has 
become a problem of concern to several researchers. According to Tyler (1972), 
Jensen's (1970) two-level theory of ability is one of the major attempts to understand 
this problem. With respect to the implications of the theory for educational policy, 
Cronbach (1975) suggests that the controversy generated by the theory may be 
unparalleled in the history of mental testing. 

Of particular interest to educators is Jensen's suggestion that Level I ability, 
which includes associative and rote memory, should be more fully exploited than at 
present in teaching low SES and disadvantaged children. The reason for this sug- 
gestion is that low SES children differ little, if at all, from middle SES children in 
Level I ability whereas they are generally found to be somewhat inferior in conceptual 
learning and reasoning ability, which are examples of Level II ability (Jensen, 1970). 


Most of the evidence directly relevant to the two-level theory is North American 
in origin, and there is little direct evidence that data from other countries support the 
theory. The theory depends upon evidence relating ability and achievement test 
scores, verbal learning data, and SES ratings. Differences between American and 
Australian society in education and other institutions could conceivably affect this 
kind of data despite some obvious similarities between the two societies. Further, 
confounding of race differences and SES in many of the American studies provides 
additional justification for examining Jensen's theory with racially homogeneous 
Australian data. 

The two-level theory postulates two broad and fundamental cognitive processes 
which are assumed to be genetically distinct and which find expression to a greater or 
lesser degree in a variety of cognitive abilities (Jensen, 1970). Few tasks tap only 
Level I or Level II ability; most can be thought of as occupying a position along a 
dimension for which Level I and Level II represent the poles. 


Level I, which comprises rote and associative processes, is defined by Jensen 
(1974) as the ability to register and retrieve information with fidelity. Level I abilities 
are characterised by a relative lack of transformation, conceptual coding, or other 
mental manipulation between information input and output. Remembering a 
telephone number for the few moments necessary to dial it is a good example of a 
task that is a fairly pure measure of Level I ability, although Level I ability is not 
restricted to short-term memory tasks. Level II tasks are more complex. They 
include conceptual and reasoning processes and are characterised by mental manipu- 
lation and transformation as opposed to simple reproduction of information. 
Level II ability is similar to Spearman’s g, the general intelligence factor common 
to most complex tests of mental ability; however Jensen’s (1978) theoretical account 
of g is somewhat different from Spearman’s. 


To Jensen, the best tests of Level II ability are non-verbal, ‘culture fair’ intelligence 
tests, such as Raven’s Matrices. Vocabulary tests are also regarded as good 
measures of Level II ability, just as they are good measures of g. In Jensen’s view 
vocabulary tests measure Level II ability because they reflect the efficiency and 
accuracy with which vocabulary has been acquired, and acquisition of vocabulary 
appears to depend largely on inferential processes in which the meaning of a word is 
arrived at by reasoning from context rather than by rote learning of associations 
between words and objects or other words. 


Jensen’s theory hypothesises that Level II processes are functionally dependent 
on Level I processes, although only to a small degree. For example, in order to solve 
an arithmetic problem which involves reasoning (Level II), it is first necessary to be 
able to hold the relevant facts in immediate memory (Level I). Actually, this view of 
the relationship between Level I and Level II represents a change in the theory which 
originally hypothesised that a sufficient degree of Level I ability was a prerequisite for 
the development rather than the utilisation of Level II ability. The earlier view implied 
that individuals possessing high Level II ability together with low Level I ability 
would not be found, but there is now evidence that this pattern of ability is common, 
at least within the normal range of individual differences in Level I and II ability 
(MacKenzie, 1980). 

Jensen (1970) suggests that teaching in schools is predominantly directed towards 
development of Level II ability, and educational achievement depends much more 
heavily on possession of Level II than Level I ability. There is ample correlational 
evidence consistent with this view: for example, measured educational achievement 
correlates more highly with intelligence test scores than with tests of associative 
memory (Lavin, 1965). It follows from this that, to the extent that SES is related to 
educational achievement, a society's educational sorting mechanism should cause 
middle and low SES groups to differ substantially in Level II ability and much less, 
if at all, in Level I ability. 


Because of the potential importance of the theory, the controversy surrounding 
it, and the paucity of Australian data, the following four hypotheses from the two- 
level theory were examined. They are: 


(1) Measures of Level II ability are associated with SES differences, whereas 
measures of Level I ability are independent of SES, or are associated with it 
to a lesser degree (Jensen, 1970). 


(2) The correlation between measures of Level I and Level II ability is greater in 
middle than low SES groups, provided Level II ability is measured by tests 
of fluid intelligence in the sense that Cattell (1963) uses the term (Jensen, 
1974). 

(3) For low SES subjects, Level I ability is more variable for children of low 
Level II ability than for children of high Level II ability (Jensen, 1970). 

(4) Level II ability is positively skewed for low SES children and negatively 
skewed for middle SES children. In contrast, Level I scores are not skewed 
in middle or low SES groups (Jensen, 1970). 


METHOD 


All fifth grade children (N = 525; mean age 10: 6 years) in five primary schools 
in the outer south-eastern suburbs of Melbourne, Australia completed four group 
tests and a questionnaire. 


Tests 

Standard Progressive Matrices (SPM). A version of Raven's Matrices (ACER, 
1958). 

Peabody Picture Vocabulary Test (PPVT). Administered as a group test since 
there is evidence that this does not significantly affect the results for children of 
Grade 3 and above (Norris et al., 1960). Forty-five items and four practice items 
covering a wide range of difficulty and omitting any unfamiliar Americanisms were 
chosen from Form B of the test. The score was the number of correct items. 


Digit Span (DS). As described by French et al. (1963). Each item comprised a 
sequence of randomly ordered, tape-recorded digits. Presentation rate was one digit 
per second. The test comprised 25 items containing from 3 to 9 digits each, and four 
practice items. The score was the sum of the number of digits in all items correctly 
reproduced. 
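
The scoring rule just described can be illustrated with a short sketch. The function name `digit_span_score` and the three items shown are hypothetical; only the rule (sum the lengths of the items reproduced correctly) comes from the text.

```python
from typing import Sequence


def digit_span_score(presented: Sequence[str], responses: Sequence[str]) -> int:
    """Sum the number of digits over all items reproduced exactly."""
    return sum(len(p) for p, r in zip(presented, responses) if p == r)


# Hypothetical items, for illustration only: the second response is wrong.
presented_items = ["381", "5927", "61483"]
child_responses = ["381", "5972", "61483"]
score = digit_span_score(presented_items, child_responses)
```

Because longer sequences carry more weight, a child who reproduces a few long items can outscore one who reproduces many short items.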


Paired Associate Learning (P-A). Modelled on the Associate Memory tests in 
French et al. (1963). Twenty-eight paired associate items were presented in four 
separate lists. There were two lists of number-word pairs, one of adjective-noun 
pairs, one of first and last names, and a practice list. One minute and five seconds 
study time was allowed for each list, and one minute for writing answers. The score 
was the total number of correct items. Spelling errors were disregarded. 


In terms of Jensen's two-level theory, both the SPM and PPVT measure Level II 
ability. Digit Span and P-A measure Level I ability. Digit Span is Jensen’s preferred 
measure of Level I ability, whereas P-A is regarded as a less pure measure which may 
shift towards the Level II end of the Level I-Level II continuum if subjects use 
conceptual mediating techniques. Should that occur, the P-A test could be expected 
to show a degree of association with SES intermediate between that shown by DS 
and the measures of Level II ability. 


Questionnaire. Each child completed a 14-item questionnaire to identify children 
who spoke a language other than English at home. Ninety such children were 
identified and their results were analysed separately from the other 435 children. 


Information from the questionnaire concerning the nature of the work their 
parents performed was used in conjunction with school records to establish parents' 
occupations. The occupation of the principal bread-winner in the family was given 
a rating of occupational prestige from the Broom and Lancaster Jones (1969) scale 
which served as an index of SES. The Broom and Lancaster Jones scale has been 
widely used and provides ratings of occupational prestige that are in broad agreement 
with the results of similar studies carried out in America (Hodge et al., 1964). The 
original 16-point scale was condensed to a 6-point scale (ACER, 1973) which is 
sufficiently discriminating for present purposes. Children in the upper three categories 
of the scale (professional, managerial, and clerical) were classified as middle SES 
(N = 182), and the others (skilled tradesmen, semi-skilled and unskilled workers) 
were classified as low SES (N = 253). 


Administration 

The tests and questionnaire were administered in two one-hour sessions separated 
by an interval of one week. Tests were administered to groups of no more than 18 
children, and each child sat alone at a separate desk in order to prevent answers being 
copied. Questionnaire answers were checked individually with each child. 


RESULTS 

Abilities and SES 

Test results for the 435 children in low and middle SES groups combined were 
correlated with ratings on the six point occupational prestige scale. The product 
moment correlations are 0.30 for PPVT, 0.23 for SPM, 0.17 for DS, and 0.15 for 
P-A. All are significant at the P<0.01 level. The Level II tests appear to correlate 
more strongly with SES than do the Level I tests, as Jensen predicts, but the difference 
is significant only for one of the Level II tests, the PPVT. The correlation between 
PPVT and SES is significantly greater than that between DS and SES, t (432) = 2.30, 
P<0.05, and between P-A and SES, t (432) = 2.69, P<0.01. In contrast, the correlation 
between SPM and SES does not differ significantly from the correlation between 
either measure of Level I ability and SES, the larger value of t being: t (432) = 1.43, 
P>0.10. 


Table 1 compares mean test results for 182 middle SES and 253 low SES children. 
The size of the difference in sigma units (sigma being the combined within groups 
variance) between the two groups is 0.52 sigma units for PPVT, 0.34 for SPM, 0.28 
for DS and 0.21 for P-A. These differences are all significant at the P < 0.01 level or 
higher, the lowest value of t (432) being 3.11. These results, which provide evidence 
of a slight to moderate relationship between SES and measures of both Level I and 
Level II ability, confirm the correlational results. 


The measures of Level I and Level II are themselves significantly correlated 
(Table 2). When the effect of each measure of Level II is partialled out separately, 
the residual correlation between each Level I test and SES is insignificant for three of 
the four possible combinations of Level I test and Level II covariate. Only the partial 
correlation of 0.11 between SES and DS, with SPM partialled out, remains significant, 
and then only at the 0.05 level. The other residual correlations are less than 0.1 and 
are not significant. These results suggest that only Level II ability is associated with 
SES to more than a trivial degree. Most of the association between SES and tests of 
Level I ability can be accounted for by the correlation between tests of Level I and 
Level II ability, a result consistent with Jensen's theory. 
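
The partialling-out step described above can be illustrated with the standard first-order partial correlation formula. The following is a minimal sketch only: the two correlations with SES are those reported in the text, but the DS-SPM correlation for the combined group is not reported, so the value used here is an assumption chosen to lie between the two within-group values in Table 2.

```python
from math import sqrt


def partial_corr(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation between x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Correlations with SES reported in the text for the combined group (N = 435).
r_ses_ds = 0.17
r_ses_spm = 0.23

# ASSUMPTION: the combined-group DS-SPM correlation is not given in the paper;
# 0.28 is an illustrative value between the within-group values of 0.19 and 0.31.
r_ds_spm_assumed = 0.28

r_partial = partial_corr(r_ses_ds, r_ses_spm, r_ds_spm_assumed)
print(round(r_partial, 2))
```

With this assumed intercorrelation the residual SES-DS correlation comes out close to the reported figure; a different assumed value would shift it slightly.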


Test intercorrelations 

Table 2 shows product moment correlations between group test scores for middle 
and low SES children separately, with correlations for middle SES children above the 
diagonal and low SES children below the diagonal. All but one of the twelve correlations 
are significant. 


Correlations between measures of Level I and Level II ability are of similar 
magnitude in both low and middle SES groups. The correlations most relevant to 
the two-level theory are those between DS and PPVT (0.23 for middle SES and 0.28 
for low SES), DS and SPM (0.19 for middle SES and 0.31 for low SES), and P-A and 
SPM (0.28 for middle SES and 0.24 for low SES). The differences between correlations 
in each SES group are small and in opposite directions for the two measures 
of Level I ability. None of the differences is significant; z does not exceed 1.40, 
P > 0.10. These results fail to support Jensen’s theory. 


TABLE 1 
TEST RESULTS FOR 435 MIDDLE AND LOW SES CHILDREN 

                               Test 
                       PPVT     SPM     P-A      DS 
Middle SES (N = 182) 
  Mean                27.81   36.48   18.20   73.88 
  SD                   4.57    7.58    5.49   20.95 
Low SES (N = 253) 
  Mean                24.27   32.62   16.55   65.60 
  SD                   4.97    8.33    5.40   20.48 


TABLE 2 
PRODUCT MOMENT CORRELATIONS BETWEEN TEST RESULTS 
FOR 182 CHILDREN OF MIDDLE SES (ABOVE DIAGONAL), AND 253 
CHILDREN OF LOW SES (BELOW DIAGONAL) 

                      Test 
          PPVT      SPM       P-A       DS 
PPVT       -       0.48**    0.27**    0.23** 
SPM      0.44**     -        0.28**    0.19* 
P-A      0.23**    0.24**     -        0.14 
DS       0.28**    0.31**    0.28**     - 

*P<0.05   **P<0.01 
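
The comparison of correlations across the two SES groups can be sketched as follows. The paper does not name the test behind the z values, so this sketch assumes the usual Fisher r-to-z test for correlations from independent samples; the function name and dictionary layout are illustrative only.

```python
from math import atanh, sqrt


def fisher_z_difference(r1: float, n1: int, r2: float, n2: int) -> float:
    """z statistic for the difference between two independent correlations."""
    standard_error = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (atanh(r1) - atanh(r2)) / standard_error


N_MIDDLE, N_LOW = 182, 253

# (middle SES r, low SES r) for the three pairs discussed in the text.
pairs = {
    "DS-PPVT": (0.23, 0.28),
    "DS-SPM": (0.19, 0.31),
    "P-A-SPM": (0.28, 0.24),
}

z_values = {
    name: fisher_z_difference(r_low, N_LOW, r_middle, N_MIDDLE)
    for name, (r_middle, r_low) in pairs.items()
}
for name, z in z_values.items():
    print(f"{name}: z = {z:.2f}")
```

Under this assumption the largest difference is the one for DS and SPM, and none of the three z values reaches 1.40.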


Jensen (1974) suggests that it is preferable to examine the regression of Level I 
on Level II rather than correlations between measures of Level I and Level II in each 
SES group. However, examination of the regression of Level I ability on Level II 
ability using z scores confirms the conclusion drawn from the correlation data. The 
coefficient of slope for the regression of DS on PPVT is 0.21 in the middle SES group 
and 0.27 in the low SES group. The corresponding coefficients for the regression of 
DS on SPM are 0.19 and 0.32. The coefficients in middle and low SES groups are 
not significantly different, and the trend is in the opposite direction to that predicted 
from the two-level theory. 


Skewness 

Table 3 shows third moment indices of skewness for distributions of test results 
for middle and low SES children, separately and combined. Significance was 
determined using the formula given by McNemar (1962, p. 78). Because direction 
of skewing is predicted in the two-level theory, a one-tailed test of significance is 
appropriate; however, for the data in Table 3, the significance levels are the same for 
one-tailed and two-tailed tests. 
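
A third moment index of skewness and its significance can be sketched as below. The exact formula on McNemar's p. 78 is not reproduced in the text, so the large-sample standard error sqrt(6/N) used here is an assumption, as are the function names.

```python
from math import sqrt
from statistics import NormalDist


def third_moment_skewness(scores):
    """Third moment about the mean divided by the 1.5 power of the second moment."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n
    m3 = sum((x - mean) ** 3 for x in scores) / n
    return m3 / m2 ** 1.5


def skewness_z(skewness: float, n: int) -> float:
    """Skewness divided by its ASSUMED large-sample standard error, sqrt(6 / n)."""
    return skewness / sqrt(6 / n)


def two_tailed_p(z: float) -> float:
    """Two-tailed probability of a normal deviate at least as extreme as z."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Index reported in Table 3 for PPVT in the combined group (N = 435).
z_ppvt_total = skewness_z(-0.24, 435)
print(round(z_ppvt_total, 2), round(two_tailed_p(z_ppvt_total), 3))
```

The one-tailed probability is simply half the two-tailed value when the skew lies in the predicted direction.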


The data in Table 3 show little evidence of skewing for any test other than the 
SPM, which is consistently negatively skewed to a moderate degree. With low and 
middle SES children combined to form one group, there is significant negative 
skewing for three tests, but it is slight except for the SPM. The data provide no 
evidence that Level II tests are skewed in different directions in low and middle SES 
groups. Nor is there consistent evidence that Level II tests are more skewed than 
Level I tests. 


TABLE 3 
THIRD MOMENT INDICES OF SKEWNESS IN TEST RESULTS 

                                  Test 
                        PPVT      SPM       P-A       DS 
Middle SES (N = 182)   -0.16     -1.05*    -0.20     0.28 
Low SES (N = 253)      -0.24     -0.63**   -0.31*    0.05 
Total (N = 435)        -0.24*    -0.78**   -0.25*    0.16 

*P<0.05   **P<0.01 


Dispersion of Level I ability 

For low SES children, the dispersion of Level I scores was examined for systematic 
change over the range of scores for each Level II test. This was done by 
determining the dispersion of Level I scores around the regression line of Level I 
ability on Level II ability (the standard error of estimate) for each quartile of the 
distribution of Level II scores. The results are shown in Table 4. Differences between 
standard errors are irregular and relatively small. Hartley’s test of homogeneity of 








TABLE 4 
STANDARD ERROR OF ESTIMATE OF LEVEL I SCORES FOR EACH 
LEVEL II QUARTILE* 

       Test                     Level II Quartiles 
Level I   Level II       1        2        3        4†     F max‡ 
DS        PPVT         18.58    20.29    22.11    18.79    1.42 
DS        SPM          20.76    18.40    18.80    20.59    1.27 
P-A       PPVT          5.35     5.53     4.72     5.43    1.37 
P-A       SPM           5.34     5.26     5.55     4.88    1.29 

* N = 253 low SES children. † 4th is upper quartile. ‡ No value is significant at P < 0.05 level. 


variance indicates that the variances of the Level I scores for each Level II quartile 
do not differ significantly. Admittedly, Hartley's test is relatively low powered, but 
the values of F max are well below the value required to be significant. Moreover, 
inspection of the data reveals no evidence of a non-significant trend in the direction 
predicted by the two-level theory, that is, for the variance in Level I ability to decrease 
systematically when going from low to high scores in Level II ability. 
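
The F max values can be checked directly from the quartile standard errors, since Hartley's statistic is the ratio of the largest to the smallest variance. A minimal sketch, with the standard errors taken from Table 4 and the function name chosen for illustration:

```python
def hartley_f_max(standard_errors):
    """Ratio of the largest to the smallest variance among the groups."""
    variances = [se ** 2 for se in standard_errors]
    return max(variances) / min(variances)


# Standard errors of estimate for Level II quartiles 1 to 4 (Table 4).
quartile_standard_errors = {
    ("DS", "PPVT"): [18.58, 20.29, 22.11, 18.79],
    ("DS", "SPM"): [20.76, 18.40, 18.80, 20.59],
    ("P-A", "PPVT"): [5.35, 5.53, 4.72, 5.43],
    ("P-A", "SPM"): [5.34, 5.26, 5.55, 4.88],
}

f_max = {
    pair: round(hartley_f_max(errors), 2)
    for pair, errors in quartile_standard_errors.items()
}
for (level_1, level_2), value in f_max.items():
    print(f"{level_1} on {level_2}: F max = {value}")
```

Only the largest and smallest standard error in each row enter the statistic, which is one reason the test has relatively low power.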


DISCUSSION 


In only one respect do the results support the two-level theory: measures of 
Level II ability are more strongly associated with SES than are measures of Level I 
ability. Moreover, most of the association between measures of Level I ability and 
SES is accounted for by intercorrelation between measures of Level I and Level II 
ability. These results support Hypothesis 1; the other three hypotheses were not 
supported. 


The size of the correlation between SES and Level II ability in this study is 
relatively small but comparable with the results of studies in America using similar 
indices of SES. Jencks (1972) reviews evidence that the relationship between family 
SES measured by children's reports of their parents' occupation is typically 0.35 for 
verbal intelligence and 0.30 for non-verbal intelligence. Jencks’ figures are only 
slightly larger than the figures reported here, and the discrepancy is a consequence 
of his use of correlations corrected for attenuation. Jencks also notes that it is 
commonly believed that the relationship between test scores and SES is greater in 
America than in most other comparable countries as a result of greater environmental 
variation in America. The present data do not support this belief. It should be 
noted that the magnitude of the relationship between SES and ability reported here 
is less than that usually found when multi-dimensional indices of SES are used. 
These take into account additional variables such as the cultural level of the home, 
parents’ educational level and income, and other factors that correlate with intelligence 
(Shaycroft, 1967), but they introduce a problem of ambiguity in interpretation. 


The most marked difference in degree of association with SES did not separate 
Level I tests from Level II; it separated the verbal measure of Level II ability from 
both the Level I tests and the non-verbal measure of Level II ability. Since Jensen 
regards non-verbal tests as more pure measures of Level II ability than verbal tests, 
this result appears to reduce the already small degree of support provided for the 
theory. The results, however, conform with much other evidence which suggests 
that verbal tests of intelligence are more closely associated with SES than are non- 
verbal tests (Jencks, 1972). 


The results for the P-A test suggest that in the present study it is as good a 
measure of Level I ability as the DS test which is Jensen's preferred measure of Level I 
ability. The P-A test does not correlate more highly than DS with SES or with the 
Level II measures. The probable reason for this result is that the test was given under 
relatively speeded conditions, since there is evidence that when speeded the test 
measures only Level I ability (Jensen, 1969). Also, the items were composed of 
concrete rather than abstract words, and there is some evidence to suggest that 
learning pairs of concrete words tends to be independent of measures of Level II 
ability, whereas learning abstract pairs is correlated with them, although only to a 
small degree (Feldman et al., 1972). 


Jensen's (1970) predictions regarding different patterns of skewness for Level I 
and Level II abilities in middle and low SES groups are not supported by the data. 
Reasonable though these predictions appear given the assumption that intelligence is 
normally distributed in the population as a whole and given the evidence that the 
mean for middle SES children is higher than that for low SES children, no convincing 
evidence to support them has yet been presented by Jensen. In the present data, 
only the SPM scores are consistently skewed (negatively) to more than a slight 
degree. Negative skewing appears characteristic of Raven's Matrices scores for 
children, adolescents, and adults (Raven, 1959). Raven himself denied that such 
skewing is a test artefact and suggested that intelligence itself is negatively skewed 
rather than normally distributed in the population. 


In its original form, Jensen's theory postulated an implicative, hierarchical, 
‘necessary but not sufficient’ relationship between Level I and Level II ability as a 
result of which the development of Level II ability is dependent upon the prior 
development of a sufficient amount of Level I ability. Hypotheses 2 and 3 are predictions 
derived from this postulated relationship, and they were not supported. 
The ‘necessary but not sufficient’ relationship has had a chequered history in Jensen's 
theory. It was first advanced to explain existing data, and then abandoned when 
further studies failed to support it. Later, Jensen (1974) returned to it with the 
suggestion that it may be supported when Level II ability is measured by tests of 
‘fluid intelligence’ (Cattell, 1963), of which the SPM is an example. The results of 
the present study fail to support this suggestion. 


Having abandoned the ‘ necessary but not sufficient’ hypothesis, Jensen then 
proposed instead that Level II ability is functionally dependent upon Level I ability, 
and that the degree of functional dependence is only slight (Jensen, 1973, 1974). 
It is not altogether clear why Jensen should assume that the degree of functional 
dependence between the two levels is slight. Problem-solving processes appear 
substantially dependent upon ability to retain in memory, for at least a brief period, 
information relevant to the problem. For example, Horn (1970) refers to evidence 
showing that difficulty in performing even rather simple reasoning tasks relates to 
inability to hold several elements within immediate memory. In saying that the 
degree of functional dependence of Level II ability on Level I ability is slight, Jensen 
may simply be recognising the fact that most people possess sufficient Level I ability 
for variation in Level II ability to be substantially independent of variation in Level I. 
If so, then it is more appropriate to conclude that the functional dependence of 
Level II ability on Level I cannot be demonstrated in individual difference studies, 
except perhaps with subjects very severely handicapped in Level I ability. In this 
connection it is suggestive that some of the initial support for the hypothesis (Jensen, 
1963) came from a study which included mentally retarded subjects. 


In the present study, failure of the data to support most of the predictions drawn 
from the two-level theory lends support to the view that the implications of the theory 
for aptitude-treatment interactions are questionable, in particular the suggestion 
that Level I ability could be more fully exploited in teaching low SES and disadvantaged 
children. While it is true that middle and low SES children differ in Level II ability 
and not in Level I, there is little to support the view that Level I ability is of much 
educational importance. Indeed, there is virtually no evidence from individual 
difference studies to suggest that a general ability factor corresponding to Level I 
processes can be identified; instead, the Level I domain appears to comprise several 
narrow and relatively independent factors which are apparently of little educational 
significance. In this respect, Level I ability contrasts strongly with Level II ability. 
Use of rote memory processes without mediational techniques such as elaboration, 
transformation, use of verbal mnemonics or imagery appears to be both inefficient 
and generally uncharacteristic of human memory processes (Lawson and Jarman, 
1977). In recent years, Jensen himself has concentrated his research on Level II 
ability rather than Level I, and his suggestion for greater use of Level I ability in the 
education of the disadvantaged was originally expressed only in somewhat general 
terms which have not been refined since. Just what the suggestion could mean in 
practice is not clear, particularly since Jensen (1970) specifically denied that it implied 
a return to old-fashioned, rote learning methods of teaching. All this suggests that 
there is at present no reason to give much weight to the original suggestion. However, 
it should be noted that this does not constitute an argument against the use of memory 
processes in education; instead, it simply acknowledges the evidence indicating that 
human memory is most efficiently employed when information to be learned is 
manipulated and transformed in ways that Jensen characterises as Level II. 


SUMMARY. Following Hunt's (1975) analysis of the Raven's Progressive Matrices 
(RPM) test, an attempt was made to identify and train two strategies for solving RPM 
problems. Four training groups were formed: one received Gestalt strategy training, 
one received Analytic strategy training, one received training in both strategies, and a 
control group received neutral instructions. Following training, subjects completed a 
set of ambiguous items and Set I of the Advanced Progressive Matrices. In the latter, 
they were required to justify their responses. Evidence is presented which suggests 
that the two strategies proposed in Hunt's analysis can be identified and that subjects 
can be trained to use and maintain these strategies. Strategy use is also related to 
post-test performance. Analysis of subjects' justifications for responses indicates that Hunt's 
item analysis of certain RPM problems needs to be modified to take into account both 
the influence of item-type and subjects’ flexibility in switching between strategies even 
after training. 


INTRODUCTION 


WHILE the concept of general intelligence still has broad support, there are increasing 
efforts to understand it in terms of more discrete and more easily defined cognitive 
processes. Humphreys puts this position clearly in his discussion of the construct of 
general intelligence: 


“Intelligence is the resultant of the processes of acquiring, storing in memory, 
retrieving, combining, comparing, and using in new contexts information and 
conceptual skills ...” (1979, p. 115) 


Currently attempts are being made to identify the component cognitive processes 
which underlie psychometric test performance (e.g., Carroll, 1976; Sternberg, 1977; 
Das et al., 1979). One of the tests which has been studied in this regard is Raven’s 
Progressive Matrices (Raven, 1938). In this paper we attempt to identify two solution 
strategies which differ in their underlying cognitive processes, and to examine the 
degree to which these strategies are subject to instruction. 


Raven's Progressive Matrices (RPM) is widely regarded as a test of general 
mental ability. It has been used by Cattell (1971) as a measure of fluid intelligence 
and by Jensen (1970) as an index of Level II ability. It is also commonly used in 
education as a test of general intelligence or non-verbal reasoning. Historically the 
test seems to have emerged as an attempt to assess Spearman’s g, ‘the eduction of 
relations and correlates’. 


Analyses of RPM items 

On the basis of a logical analysis Hunt (1975) suggested that, for a large number 
of RPM items, there were two quite different solution algorithms. One was described 
as the Gestalt algorithm, which “deals with a problem by using the operations of 
visual perception, such as the continuation of lines through blank areas and the 
superimposition of visual images upon each other” (p. 133). The Analytic algorithm, 
on the other hand, “applies logical operations to features contained within elements 
of the problem matrix” (p. 133). Whereas the Gestalt algorithm relies upon the 
mental manipulation of sensory images, the Analytic algorithm deals with abstracted 
features of the displays, by means of operations such as constancy, supplement/delete, 
expansion/contraction, addition/subtraction, movement, and composition/decomposition 
(Hunt, 1975, pp. 146-147). While the Gestalt algorithm is seen as 
less developed than the Analytic one, Hunt proposes that use of the Gestalt algorithm 
alone in Set I of the Advanced Progressive Matrices would result in a score 
“slightly below average performance in the normal adult population” (p. 141). 
Similarly, inspection of the Coloured Progressive Matrices (suitable for elementary 
school children) suggests that the Gestalt algorithm could solve almost all of these 
items. In all cases the Analytic algorithm is required for solution of the most diffi- 
cult items. 


The distinction between Gestalt and Analytic algorithms recalls the factor 
analytic studies which have examined the loading of RPM items on verbal and visual/ 
spatial factors (see Raven et al., 1977, p. SPM10). While these studies have con- 
sistently shown the involvement of visual/spatial processes in the solution of RPM 
items, the role of verbal processes is less clear. Burke and Bingham (1969) have 
suggested that the influence of verbal processes is apparent in that subjects ‘talk their 
way’ through RPM items (p. 251). In addition Bock (1973) has hypothesised that 
there is significant verbal content in the Raven's test. 


However, Hunt's analysis is not simply a restatement of the above arguments. 
His position and that derived from the factor analytic studies are not isomorphic: 
for instance, both of Hunt's algorithms could be employed verbally while involving 
different processes. In addition his analysis does provide a more precise specification 
of the different types of processes involved in RPM performance. 


If supported empirically, Hunt's analysis could have far-reaching implications. 
Once the possibility of subjects' using different strategies or algorithms to solve a 
particular set of test items is accepted, then interpretations of test performance 
must become more complex. The total score will not tell us much about the way in 
which the individual solved the RPM test items. This leads, as Hunt suggests, to a 
consideration of an individual's cognitive style in any interpretation of RPM test 
performance. More specifically it is possible that children who score low on RPM 
are not spontaneously employing the optimal (i.e., Analytic) algorithm. If such 
were the case then any assessment of a child's general mental ability would need to 
consider whether the failure to use the optimal strategy was due to an inability to use 
it effectively, a mediation deficiency in Flavell’s (1970) terms, or whether the child was 
simply not aware of the strategy but could use it effectively following instruction. The 
latter position represents a production deficiency as described by Flavell. 


What evidence is available for Hunt's algorithm approach? Little direct evi- 
dence apart from Hunt's original paper has emerged, although there are several 
sources which provide indirect support. For instance Corman and Budoff (1974a, 
1974b) factor analysed responses on the Coloured Progressive Matrices items for a 
variety of populations and found consistent support for the existence of four subscales: 
Discrete Pattern Completion, Simple Continuous Pattern Completion, Continuity 
and Reconstruction of Simple and Complex Structures, and Reasoning by Analogy. 
The first three of these would appear to involve components of the Gestalt algorithm, 
while the last would generally require the Analytic algorithm. 


Kirby and Das (1978) attempted to demonstrate that those who successfully 
completed the coloured RPM Reasoning by Analogy items were also more able in the 
area of analytical (reasoning) ability, and that those who were more successful on the 
Continuity and Reconstruction of Simple and Complex Structures items would do 
better on measures of spatial ability. In a sample of 9-year-old children, these hypotheses 
were not found to be tenable. While this result does not disprove Hunt's 
position, it does not offer it the support it might have. 


Lawson and Kirby (1978) examined the role of early items in the development of 
the Gestalt and Analytic strategies. While grade 4 and grade 6 students showed no 
effects of exposure to items which encouraged development of the two strategies, grade 
$ students who had experienced the Analytic items did significantly outperform those 
given Gestalt items. 


The studies described above were not designed to establish Hunt's alternative 
solution algorithms, and can only be interpreted as weak support, or nonsupport, 
for them. Clearly what is needed is a study designed specifically to isolate these two 
algorithms or strategies. 


Identification of strategies 

The most straightforward approach to the isolation of strategies is to question 
subjects about their reasons for each response. While possessing all the difficulties of 
introspection, such a technique is not completely without value. It has been applied 
extensively in information processing analyses of cognition (Ericsson and Simon, 
1978), in Piagetian research, and more recently in investigations of metacognition 
(see Lawson, 1980, for a review). 


A second approach is to examine the patterns of subjects’ responses and to relate 
these to verbal justifications. If items can be identified that clearly require one 
strategy and not the other, then success on those items can be taken to indicate that the 
strategy was used. With regard to the RPM this technique could identify use of the 
Analytic algorithm, but not the Gestalt, as all items solvable by the Gestalt algorithm 
are also solvable using the Analytic. 


In the traditional RPM items use of a further strategy identification technique 
which is based on choice of option within an item is not available, as the design of 
response options does not permit clear identification of Gestalt or Analytic alternatives. 


The final technique to be discussed employs strategy training. If the different 
algorithms exist as different strategies, instruction in use of a particular one could 
increase use of that algorithm. This increased use of a particular algorithm could be 
detected by analysis of both justifications and response patterns. 


In this study different groups of subjects will be given different forms of training 
and the effects of this training will be examined. To assess the effects of this training 
in more detail, and to relate the different forms of training to the strategies proposed 
by Hunt (1975), subjects will first complete an Ambiguous Items test. In this test 
choice of options from among a number of ‘correct’ alternatives can be related 
directly to strategy training. This test is designed to allow for identification of 
strategy in a way that the RPM does not, i.e., through analysis of the patterns of choice 
of particular alternatives to matrix problems. For the RPM problems used in this 
study, because the design of answer options is not systematic, the indices of strategy 
choice will be response patterns and justifications. 


The purpose of this study is to attempt to identify Hunt’s two processing algo- 
rithms as solution strategies in RPM items, by means of the techniques described 
above. More specifically our aims are to (1) see whether two distinct strategies can be 
isolated using both verbal justifications and response patterns as indicated, (2) see 
whether these strategies can be trained and maintained, (3) investigate how strategy 
use is related to performance both overall and on specific items, and (4) relate these 
results to Hunt's analysis. Because the algorithms may be related to developmental 
stages, subjects of an age at the beginning of formal operations were most appropriate 
for the study. 


METHOD 
Subjects 
The subjects were 80 sixth-grade boys attending a metropolitan primary school 
in Adelaide, Australia. The boys came from each of the school’s grade 6 classes and 
represented a broad range of ability. The mean age of the boys was 129 months 
with a standard deviation of 5.6 months. Subjects were randomly assigned to one of 
four training groups. 


Training 

Four instructional conditions were designed for this study. The G and A conditions 
were designed to train the use of the Gestalt and Analytic algorithms respectively. 
A third condition with Dual (D) instructions was intended to provide 
training in use of both Gestalt and Analytic algorithms. The final, Control (C) 
instructions were designed to provide a neutral training for one group of subjects, 
similar to the standard RPM instructions. 


In the G instructions subjects were told: "We are going to do problems in 
which you have to work out what is missing from a picture. Each problem is basically 
a pattern with a piece missing. You have to pick out a piece to put in that space 
so that the picture or the pattern is finished. Look at each of the pictures and 
try to pick out the missing pieces. You have to pick the piece that completes the 
picture—that makes it look like a good pattern." Following this general introduction, 
subjects were taken through items A1, B2, Ab7, Ab9, Ab12, B7, B8, C4 from the 
Coloured Progressive Matrices and Standard Progressive Matrices. For each of these 
items the above training was repeated and subjects were given feedback after each 
item to ensure that they had used the appropriate strategy. In addition, subjects were 
given one specially constructed ambiguous item (see Figure 1). Again, use of the G 
strategy was stressed and choice of G options emphasised. 


The same format was used for the A training with appropriate modifications. 
The general instructions to subjects indicated that: "We are going to do problems in 
which you have to work out what is missing from a picture. For each problem there 
is a rule which tells you what should be in the empty space. What you have to do is 
work out the rule and then work out what the missing piece is. First try to work out 
the rule and it will help you work out what the missing piece is." Then the practice 
items were administered with emphasis placed for each item on the application of the 
Analytic algorithm. 

The D training group received both the above sets of training, along with the 
following additional instructions: “ We have found that there are two good ways to 
solve these problems. Sometimes a good way is to pick out a piece to put in the space 
so that the picture, or the pattern, is finished. You pick out the piece that completes 
the pattern that makes it look a good picture (the first training item discussed here). 
Now sometimes this completing of the pattern doesn't work very well and instead we 
have to try to work out a rule for the problem. (Application of Analytic algorithm 
discussed.) So with these problems you can use two ways to solve them. You look 
at each problem and decide how you can do it—either complete the pattern or work 
out the rule. On some of the problems you might see that you can use both ways. 
You choose whichever way you think is best for you." 


The final training group, the C group, received instructions which did not stress 
any specific solution strategy. “ With these problems we have to find out which one 
of these (alternatives) goes in this space. You have to pick out which one goes into 
this missing piece of the problem." Subjects were then given the first two training 
items. The discussion of these training items did not involve advice on application of 
a solution strategy. 


FIGURE 1 
FIRST AMBIGUOUS TRAINING ITEM 


Post-test measures 
Following completion of the training session subjects were given the post-test 
items. These items were the same for all groups. 


The first set of items were those in the Ambiguous Items test. These were intended 
to provide a direct index of strategy use. In each of these nine items at least 
one of the options was designed to satisfy the Gestalt strategy, and one was compatible 
with use of the Analytic strategy. Choice of one of these options was taken as evidence 
of use of the particular strategy. 


The second post-test was Set I of the Advanced Progressive Matrices. These 12 
items had been analysed by Hunt (1975, Table 6.1) who suggested that while items 
1-6 could be solved by use of either algorithm, items 7-12 required use of the Analytic 
algorithm for solution. 


Following completion of each item in both of these tests, subjects were asked to 
say why they had chosen a particular option. 


Procedure 

All training and testing was carried out individually by one experimenter. 
Subjects indicated their choice of options on an answer sheet and their verbal justifications 
were tape-recorded during performance of the Ambiguous and Set I items. The 
complete training and testing session lasted approximately 40 minutes for each subject. 


Coding of justification responses 

Following the completion of data gathering the subjects’ justification responses 
were transcribed for analysis. Each response was then classified using a three-part 
coding scheme. Justifications were classified as G if subjects stressed any of the 
following factors: the spatial nature of the array; the continuation of elements of the 
array; ‘colouring-in’ of parts; or the visual appearance of the chosen option (e.g.,
the balance of symmetry or “It's getting bigger as it goes across”, “It goes skinny
then a bit wider”). Justifications were classified as A if they stressed rules, such as: a
counting rule (“this one has 3, this one 2, so this must have 1”); a position rule (“top
left, top right, bottom left, so it must be bottom right”); or a subtraction rule.


Justification responses were classified by two raters, neither of whom knew the
subjects’ training group membership. Percentage agreement was 80 per cent and
Cohen’s coefficient of agreement (K) attained a value of 0.65 with a maximum K value
of 0.87 (Cohen, 1960). If no agreement could be reached about a response, or the
response was tautological or irrelevant (“Because its got all little squares in it and
there's no other like it”), it was placed in the Unclassifiable (U) category.
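The agreement statistics quoted above can be illustrated with a minimal sketch. The cross-classification counts below are hypothetical (the raters' actual table is not reported here), and the maximum-kappa formula is the usual one based on the smaller of each pair of marginal proportions, assumed to correspond to the maximum K referred to in the text.

```python
def cohen_kappa(table):
    """Return (observed agreement, kappa, maximum kappa) for a square table.

    table[i][j] is the number of responses placed in category i by rater 1
    and category j by rater 2.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_p = [sum(table[i]) / n for i in range(k)]
    col_p = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    p_observed = sum(table[i][i] for i in range(k)) / n
    p_chance = sum(row_p[i] * col_p[i] for i in range(k))
    # Largest agreement attainable given the two raters' marginal totals.
    p_max = sum(min(row_p[i], col_p[i]) for i in range(k))
    kappa = (p_observed - p_chance) / (1 - p_chance)
    kappa_max = (p_max - p_chance) / (1 - p_chance)
    return p_observed, kappa, kappa_max


# Hypothetical counts for 100 justifications; rows and columns are G, A, U.
ratings = [
    [40, 5, 3],
    [6, 30, 2],
    [2, 2, 10],
]
p_o, kappa, kappa_max = cohen_kappa(ratings)
print(f"agreement={p_o:.2f} kappa={kappa:.2f} max kappa={kappa_max:.2f}")
```

Kappa is lower than raw percentage agreement because it discounts the agreement expected by chance from the raters' marginal totals.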


RESULTS 
Ambiguous items 

The mean level of choice of G and A options for the four groups is set out 
in Table 1, and for both sets of scores the means of the groups were significantly
different (G-score: F (3, 76) = 45.8, P<0.001; A-score: F (3, 76) = 43.0, P<0.001).
Post-hoc comparisons using the Newman-Keuls procedure indicated that for the 
G-score the Gestalt and Control group means were significantly higher than those of 
the Dual training group, which was in turn significantly higher than the mean G-score 
of the Analytic training group. As might be expected, the pattern of the post-hoc 
comparison results was exactly reversed for the A-scores. 
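As a rough illustration of where the F ratio and its degrees of freedom come from, the sketch below computes a one-way analysis of variance by hand. With four groups of 20 subjects the degrees of freedom are 4 - 1 = 3 and 80 - 4 = 76, as quoted above. The scores in the example are hypothetical and much smaller than the study's groups; the function name is illustrative only.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means about the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual scores about their own group mean.
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical scores for three small groups (not data from the study).
f_ratio, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"F ({df_b}, {df_w}) = {f_ratio:.2f}")
```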

The frequency of A or G responses for each of the Ambiguous items is set out in 
Table 2. It can be seen that the Gestalt and Control training groups responded almost
exclusively in a G fashion, i.e., they chose options which completed or balanced the
pattern in the problem matrix. The Dual training group does appear to have
responded more flexibly. Although these subjects did choose more G than A options, they
chose more A options than either the Gestalt or Control groups. 


TABLE 1

AMBIGUOUS TEST ITEMS: MEAN NUMBER OF GESTALT AND ANALYTIC
RESPONSES IN THE FOUR TRAINING GROUPS

                          Training Group
             Gestalt    Analytic    Dual       Control
             (N=20)     (N=20)      (N=20)     (N=20)
G-score*     8.7        2.7         5.85       8.15
A-score      0.05       5.7         2.65       0.35

* G-score is mean number of G-options chosen. A-score is mean number of A-options chosen.
Maximum possible score in both cases is 9.

TABLE 2

AMBIGUOUS TEST ITEMS: FREQUENCY OF GESTALT OR ANALYTIC
RESPONSES

                          Training Group
Test         Gestalt    Analytic    Dual       Control
Item*        (N=20)     (N=20)      (N=20)     (N=20)
1. G         20         1           6          17
   A         0          19          14         2
2. G         20         0           16         19
   A         0          20          4          0
3. G         20         2           9          20
   A         0          18          10         0
4. G         18         2           7          15
   A         0          17          11         1
5. G         20         6           17         20
   A         0          14          3          0
6. G         18         5           17         19
   A         0          10          3          1
7. G         19         7           3          19
   A         0          12          16         0
8. G         20         9           14         17
   A         0          8           6          2
9. G         18         17          16         17
   A         1          2           4          2

* G indicates frequency of choice of Gestalt option. A indicates frequency of choice of Analytic
option. Some options were appropriate to neither algorithm, so all cells do not add to 20.


The pattern of responses in the Analytic training group is less straightforward.
Overall there is a clear preference for the A options, but choice of A options for this
group weakens in the later items in this test, items in which the rules are more complex
than in the first four items. Item 9, for which G-choices were dominant in all groups,
is shown in Figure 2. In this item option 1 is the Analytic response (having 4 points)
and either 2 or 3 can be regarded as Gestaltic choices (being coloured in). It seems
that when faced with a difficult rule-derivation task or one in which there is conflicting
information, the subjects given Analytic training (and Dual training) reverted to a
Gestalt response.


Table 3 shows the relationship between choice of option and type of verbal 
justification provided for each item. For most items across all four groups the
justification accords with the option choice, i.e., G-choices are justified using
G-related criteria. There are, however, instances where choice and justification do not
agree. Item 9 provides the most obvious case of this disagreement: in the two groups 
which received Analytic training there appears to be some confusion about the basis 
for choice of a particular option. It should also be noted that this disagreement 
between choice and justification is not limited to the Analytic training, for on Item 9 
some of the Gestalt training group supplied Analytic justifications for their G-choices. 
With these exceptions the degree of correspondence between choice and justification 
is high. 


Set 1 items 

The means and standard deviations for the Training Groups’ Set 1 performance
are given in Table 4. A one-way ANOVA indicated that the group means were
significantly different (F (3, 76) = 3.18, P<0.03). Post-hoc comparisons of means using a
Newman-Keuls procedure indicated that the means for the Gestalt, Analytic and Dual
groups did not differ significantly (P>0.05), and that those for the Dual and Analytic
training groups were significantly higher (P<0.05) than the mean for the Control
group. The means for the Gestalt and Control training groups were not significantly
different. Given this pattern of results it seems that the Analytic training did confer
an advantage on subjects in the Analytic and Dual training groups, and that this
use of the Analytic algorithm was carried beyond the Ambiguous items into
the Set 1 items.

FIGURE 2
NINTH AMBIGUOUS TEST ITEM


Further analysis of the Set 1 data (see also Table 4) shows that the groups did not
differ over the first six items (those that Hunt argued could be solved by either
algorithm) (F (3, 76) = 1.99, P>0.10). The groups did differ in success in the last six
items (F (3, 76) = 2.73, P<0.05), those said by Hunt to require the Analytic algorithm.
This effect is due to the difference between the Analytic and Control groups
(Newman-Keuls, P<0.05). While the difference between the Analytic and Gestalt groups is in
the direction predicted by Hunt it does not reach significance at the 0.05 level.


Details of the pattern of correct responses and of the justification given for these 


M. J. LAWSON and J. R. KIRBY


TABLE 3

AMBIGUOUS TEST ITEMS: RELATIONSHIP OF CHOICE OF OPTION TO JUSTIFICATION
OF THAT CHOICE

                                     Group
              Gestalt          Analytic         Dual             Control
              Justification†   Justification    Justification    Justification
Response*     G  A  U          G  A  U          G  A  U          G  A  U

1. G   20 1 6 1 17
   A   19 13 2
2. G   20 5 15 19
   A   15 14
3. G   20 2 9 20
   A   18 10
4. G   18 2 7 15
   A   17 11 1
5. G   20 6 17 20
   A   14 3
6. G   18 5 16 1 19
   A   10 12 1
7. G   19 7 16 19
   A   2 10 3
8. G   20 8 1 14 17
   A   8 1 5 1 1
9. G   14 4 5 10 2 10 6 15
   A   1 2 1 2 1 2

* G and A indicate choice of G, or A, options on each item.
† G, A, and U indicate rating of subject’s justification according to criteria described in Method
section.


TABLE 4

MEANS AND STANDARD DEVIATIONS FOR TRAINING GROUPS ON SET 1 ITEMS

                                          Training Group
                              Gestalt       Analytic      Dual          Control
Total test performance        7.4 (1.69)    8.25 (1.86)   8.4 (2.18)    6.6 (2.52)
Performance on Items 1-6      5.1 (0.96)    5.2 (1.15)    5.6 (0.99)    4.75 (1.29)
Performance on Items 7-12     2.3 (1.22)    3.05 (1.35)   2.8 (1.64)    1.85 (1.53)

Standard deviations are given in parentheses.


on the Set 1 items are provided in Table 5. It is clear that there are several items in
which particular types of training appear to have given some subjects an advantage
over others. In all cases where frequency of correct response of highest and lowest
scoring groups diverge by 5 or more points, the advantage is held by one of the groups
given the Analytic training. In the majority of such instances it is the Control group
which fares worst though on items 3 and 6 (and to a lesser extent on 11 and 12) the
Gestalt training group performs more poorly than either the Dual or Analytic groups.
It is interesting to note that the Control group did, according to their justifications,
use an Analytic strategy quite extensively on the later items in Set 1. This would be
expected in a test which is regarded as a measure of general, or fluid, intelligence.


Table 5 also provides information relevant for consideration of the relationship 
between training group membership and justification provided for correct responses. 
For the Gestalt training group the justifications provided are predominantly of a 
Gestalt nature, i.e., the subjects justified their choice of option in terms of completion 
or balancing of a figure in the matrix. In the Analytic group the pattern of justifi- 
cations is less straightforward. Clearly in the first five items of Set 1 the response 
justifications reflect mostly the use of Gestalt criteria. Apparently in these items the 
nature of the item is a more potent factor than the training. However, the pattern of 
justification changes markedly for the remaining items of Set 1: in these, subjects in 
the Analytic group (as well as those in the Dual and Control groups) provide mainly 
justifications which reflect the use of Analytic criteria. 


This analysis of item performance must be qualified by noting the somewhat
different pattern of response-justification relationships for items 6 and 10. Item 6 was
regarded by Hunt (1975) as solvable using the Gestalt algorithm. Yet in the analysis
presented in Table 5 it seems likely that a majority of subjects in all four training 
groups approached the item in an Analytic fashion. Furthermore, it is the subjects in 
the Analytic training group who performed most successfully on this item. 


Item 10 also represents a point of divergence from Hunt's classification of the Set 
1 items. While the Analytic, Dual and Control training groups provide Analytic 
justifications for their choices of the correct option, the Gestalt group does not follow 
suit. Inspection of item 10 (see Figure 3) and of the Gestalt group's justifications 


TABLE 5

FREQUENCY OF CORRECT RESPONSES FOR SET 1 ITEMS AND JUSTIFICATIONS PROVIDED FOR
THESE RESPONSES

                                  Training Group
            Gestalt          Analytic         Dual             Control
Set 1       Justification    Justification    Justification    Justification
Item        G  A  U          G  A  U          G  A  U          G  A  U

1     20 16 2 20 17 2
2     17 1 6 11 3 16 2 16
3     15 17 20 16
4     20 18 19 16 3
5     13 2 11 4 1 8 9 3 7
6     6 8 5 14 17 1 12
7     11 4 5 11 14 2 8
8     6 2 8 1 9 1 4
9     2 4 1 8 1 8 20 9
10    1 3 5 13 4 10 3 8
11    1 14 2 1 1 1
12    1 1 5 6 4 1
Total Items
1-6   91 11 73 31 1 70 42 55 40
Total Items
7-12  30 14 2 12 49 8 48 8 27 2

* Justifications are rated G, A or U according to the criteria set out in Method section.

FIGURE 3
ITEM 10 OF SET 1





reveals why this item is approached in this manner. Subjects who were correct 
apparently ignored most of the display and concentrated their attention upon the last 
row or column (e.g., “As it goes down it gradually fills up till it fills right up”). Thus
contrary to Hunt’s view, it seems that item 10 is solvable using the Gestalt algorithm. 


If performance is compared across groups for items 1, 2, 3, 4, 5 and 10 versus
items 6, 7, 8, 9, 11 and 12 the pattern of significance found is similar to that indicated
in Table 4.


DISCUSSION AND CONCLUSIONS 
This study was designed to address four issues: (1) can two distinct strategies be
identified? (2) can the strategies be trained and maintained? (3) is strategy use
related to performance? and (4) can these results be related to Hunt’s analyses? We
will discuss each of these in turn.

Can two strategies be identified?

It would appear that the answer to this question is ‘yes’. It was not difficult to
follow Hunt’s analysis and devise brief training sessions. Similarly it was not difficult 
to assign unambiguously subjects’ responses to categories. Subjects’ choices in most 
of the ambiguous items were different according to training group membership. The 
pattern of responses and justifications for these, and in the Set 1 items, also varied 
with type of training. These findings suggest that strategy is not wholly dictated by 
item type and that subjects given different sets of training instructions do respond in 
identifiably different ways. 


The validity of our arguments for strategy identification depends of course on the
criteria we have used. The results here give reasonable support to the usefulness of 
three main indices of strategy use. The levels of performance on Set 1 items indicate 
that training did affect performance in an overall sense though this is rather a gross 
index for identification of particular strategies. The second index, the ambiguous 
item performance, did provide a more specific check on the effects of strategy instruc- 
tion; at least for the first eight ambiguous items, training group membership is a good 
predictor of response. To this extent, training does seem to have encouraged subjects 
to adopt different strategies. The use of the ambiguous items allows analysis of the 
interaction between two components discussed in the introduction of this paper: 
training, and analysis of the subject’s pattern of response. Because of this the am- 
biguous items provide an advantage over the normal RPM items in which the design 
of the options for an item is often less than optimal for diagnosis of strategy use. It is 
often impossible to infer from errors on RPM items why a subject was not successful. 


The justification responses were used as a third source of information about 
strategies, and, as indicated in the analysis of the data presented in Table 5, they do 
appear to elucidate strategy use on particular items. When subjects’ justifications of 
responses are taken into consideration the interaction between training and item-type 
can be examined more closely. Thus it appears from Table 5 that there is a con- 
siderable degree of switching between A and G strategies by subjects within training 
groups. 

The justifications given by the Gestalt training group for their choices on items 6 
and 7 (see Table 5) provide examples of this strategy switching. On these items 
subjects appear to switch between strategies largely in response to differences in item
type.

However, the strongest support for the use of justification data emerges from 
an analysis of responses in item 10. For this item where the present results are in 
conflict with Hunt’s analysis the justifications given by subjects provide a possible 
means of resolving the conflict. 


It appears that subjects’ strategies did not persist in difficult problems (the ninth 
ambiguous item) or in problems which seemed to have a bias toward a particular 
strategy (Item 1 in Set 1). However, this is not surprising and in no way weakens the 
case for identifying the two strategy types. Subjects do not begin such tests with a total 
lack of knowledge about how to do them, and in any case would, from experience 
with the earlier items, learn ways of dealing with them. This is clear from the ‘weak
training’ results obtained by Lawson and Kirby (1978).


Can the strategies be trained and maintained ? 

Again the answer would be a qualified ‘yes’. Subjects did behave differently
and in the predicted directions following training. In this regard comparisons 
between training and control groups are instructive. Not only did the trained groups 
tend to perform better than the control group on Set 1 items, but the pattern of 
response of some training groups differed from that of the control group in ways which 
reflected specific training. Thus on the ambiguous items Control group subjects
selected mainly Gestalt options and in this way differed markedly from the subjects in
the Analytic and Dual training groups. The justifications provided by the Control 
group reinforce the interpretation of performance on the ambiguous items. Yet on the 
Set 1 items the Gestalt and Control groups provide quite different patterns of justifi- 
cations. This suggests that the Gestalt training did exert a powerful influence on the 
Gestalt subjects to the extent that they maintained use of a Gestalt strategy (as 
reflected in justification data) on item 10 despite the fact that on two previous items 
(6 and 9) some of these subjects switched to more Analytic strategies. This pattern is 
not present in the data of the Control group. The fact that the Analytic training was
successful seems clear from the overall success of the Dual and Analytic groups on Set
1 items, and from the pattern of their responses on the ambiguous items.


Maintenance of strategy use is indicated in similar patterns of response. On the 
basis of choice of options in the ambiguous item set, it seems reasonable to argue that 
subjects were using different strategies and were justifying their choice using different 
criteria. This differential strategy use does not appear to have been limited to the 
ambiguous items, for the groups differed in both the level of performance on Set 1 
items and in the patterns of response to specific items within that problem set. 


Longer training might well produce a stronger effect, though it would be difficult 
to overcome two biases: the apparent developmental trend from G to A thinking, 
and the dominance of the item type in problem solving. The only evidence of the 
developmental trend in the present study is the performance of the A groups on the
ninth ambiguous item. Upon encountering a difficult item they appeared to ‘regress’
to a G strategy.


Item bias is also quite strong. If items have elements that vary in an obvious 
numeric fashion (e.g., three dots, then two dots, then one dot) it is difficult to imagine 
a way to stop 10-year-old children from counting and arriving at a simple rule. This 
is particularly so because the Analytic algorithm provides answers which are in
agreement with those of the Gestalt algorithm. Thus a child trained Gestaltically may still
be attending to numerical rules during training; when able to respond freely, his 
mentioning of a numerical rule would have his response classified as Analytic. 


The influence of item type is apparent when subjects’ justifications of their 
responses are considered. If justification does index strategy use then it appears that 
subjects are quite flexible in using different strategies even though they have received 
special training at the start of a testing session. Since subjects do appear to be 
switching between alternative strategies, the influence of training instructions must 
not be over-estimated. In this regard the subjects’ justifications are an important 
aid in interpreting the effects of strategy training. 


Is strategy use related to performance ? 

This is an important issue which has not received adequate attention in the 
strategy literature. Table 5 above shows clearly not only that there are preferred
strategies for most Set 1 problems, but also that a particular strategy (G for items 1, 3
and 4, and A for items 6, 8 and 9) is overwhelmingly successful on specific items.


In addition to these item-specific effects of strategy use, the level of performance
on Set 1 items suggests that use of a particular strategy does affect overall degree of
success on RPM problems, especially the more difficult problems. In this respect it
seems that Analytic training provides subjects with a significant advantage.


The flexible training provided for the Dual group subjects did result in the best
level of performance though this was only marginally superior to that of the Analytic
group on the Set 1 items. The Dual and Analytic groups did not, however, respond in
identical ways, for the analysis of the responses to the Ambiguous items suggests that
the Dual training group were more flexible in their responding. They chose more
Gestalt options and gave more Gestalt-type justifications than did the Analytic
training group.


Relation to Hunt's classification

While these results confirm Hunt's analysis to a great extent, they do suggest some 
modifications. Item 10 provides the major point of difference: our results suggest 
that this item may be solved using both Analytic and Gestalt algorithms, and that 
subjects may be using a variant of the super-imposition operation described by Hunt. 


Responses to both items 2 and 6 also depart from Hunt's analysis. While they 
may both be soluble by the Gestalt algorithm, only a minority of subjects outside the 
Gestalt trained group (of those who were correct) did use that strategy. Both items 
have elements that have small numbers of dots or lines, and subjects generally counted 
these and formed a rule. 


There is also weak support in these data for Hunt's contention that the algorithms 
are developmentally related. In the ninth ambiguous item, subjects trained analytically
did switch to a Gestalt strategy, even though they later returned to an analytic one. 


RPM and the measurement of general intelligence 

Hunt's (1975) identification of two solution strategies led him to question the 
usefulness of RPM as a measure of general intelligence. While we sympathise with his 
doubts, the existence of different solution strategies does not lead us to dismiss the 
RPM as a measure of g. Rather we would suggest that alternative solution strategies 
are quite compatible with Humphreys' definition of g given at the beginning of this 
article. The two algorithms represent two different ways of “acquiring, storing in
memory, retrieving, combining, comparing, and using ...” information (Humphreys,
1979, p. 115).

In fact we would suggest that Humphreys' list of characteristics of g should be
expanded to include a metacognitive component which would encompass the spon- 
taneous strategy choice and appropriate strategy modification observed in this study. 
While no explicit attempt is made to assess these metacognitive features in traditional 
tests of g (just as no explicit attempt is made to assess many of the features identified 
by Humphreys) they can be seen to be involved in all but the simplest of cognitive 
measures. 


Conclusions 

It would be an understatement to describe the RPM as a complex test. Hunt 
(1975) has argued that there are two types of strategies or algorithms for the different 
items. In a previous study (Lawson and Kirby, 1978) we showed that simply
lating the early items completed could change some subjects’ performance on RPM 
items. The present study has experimentally verified the existence of the two strategies 
and has shown that subjects can, to a certain extent, be trained to use them. We have 
also argued that the use of the strategies can be maintained across two sets of post-test 
items within which strategy use is related to both overall performance and to perform- 
ance on specific matrix problems. The individual items, however, were found to have 
a powerful effect upon the strategies subjects adopted. Thus both training and item 
type influenced the strategy chosen, which in turn was vitally related to subjects' level 
of performance.


THE EFFECTS OF ITEM BY ITEM FEEDBACK GIVEN
DURING AN ABILITY TEST


By C. WHETTON AND R. CHILDS
(National Foundation for Educational Research)


Summary. Answer-until-correct (AUC) is a procedure for providing item by item 
feedback during a multiple-choice test, giving an increased range of scores. The 
performance of a group of children on an ability test using AUC procedures was 
compared with a group using conventional instructions. AUC scores considerably 
enhanced reliability but not validity. A comparison of item analyses for the two groups 
showed no evidence of learning during the test. This suggests that attempts to derive 
measures of ‘ability to learn’ which are based on changes in responding during the course 
of a test administered with feedback should be treated with caution. Evidence should 
be provided that similar changes in response pattern do not occur in a conventionally 
administered test. 
INTRODUCTION 


THE traditional method of answering multiple-choice tests has often been criticised 
for yielding too little information about the testee. There have therefore been many 
efforts to go beyond the simple right or wrong scoring usually used in such tests to 
gain further information using more refined techniques. 


Using conventional pencil and paper tests there have been three main attempts 
at obtaining extra information. These are (1) differential weighting (each option is 
weighted so that the more incorrect choices are more heavily penalised), (2) confidence 
weighting (the testee is required to indicate his confidence in all the options and is 
scored accordingly), (3) elimination scoring (the testee eliminates those options he is
sure are incorrect leaving one or more which he considers may be correct). The
rationale behind all these is that learning is not a simple all or nothing process, but 
that the subject may possess a partial knowledge of the answer to any question and 
that this can be measured by these methods either singly or in various combinations. 


These methods have proved in practice to be disappointing. It appears that test 
reliability may be slightly enhanced but that validity may be reduced. Hakstian and 
Kansup (1975) after a large-scale comparison of methods of assessing partial
knowledge concluded that, “... in terms of current methods of implementing it and common
scholastic criteria, confidence testing, like elimination testing, appears to have little
to recommend it over conventional testing”.


Recently, however, a further method which attempts to assess partial knowledge 
has become available. This is based on the provision of feedback to the subject 
during the test, providing him with information about his performance on each item. 


It is many years since Pressey (1926) first suggested the provision of feedback 
during a test would increase the utility of that test as a means of instruction. Pressey 
(1950) developed a technique whereby a multiple-choice test item could be immediately 
scored and the pupil immediately told the correct answer. This rapidity of knowledge 
of results is beneficial to the pupils’ learning of the material being tested (Annett, 
1969). Pressey also suggested that the procedure improved the test itself, making it 
more reliable and its results more valid. These advantages might arise for two quite 
separate reasons. The first explanation could be that the range of scores per item is 
increased, the second that partial knowledge is being tapped. 

The examination of the supposed advantages of feedback during a large-scale
group test has had to wait for a workable technology. Two main methods are now
available, the first involving the removal of an opaque substance (carbon or plastic)
from the answers and the second involving the use of a special marker pen to reveal
an image printed invisibly. Studies of feedback during tests have fallen into two
categories: firstly, those that vary ‘knowledge of results’, in order to investigate the
effects on tests taken subsequently and, secondly, those designed to investigate the
advantages of the various scoring methods.

An example of an investigation of the former type is that of O'Neill et al. (1976) 
who compared the performance of four groups of students on an ability test. The 
groups were (1) no feedback, (2) feedback after the test, (3) complete feedback after 
each item (the correct answer was disclosed without any search), (4) complete feedback 
after each item with the student searching for the correct answer until it was found. 
All feedback groups improved their performances when the test was immediately
readministered; however, the increase did not differ between the two immediate 
feedback groups. This study suggests that feedback itself increases learning but that 
the form of the immediate feedback is not important. Another comparison has been 
between partial feedback (the subject discovers his answer to be right or wrong) and 
complete feedback (the correct answer is disclosed). Hanna (1976) compared the 
effects on post-test scores of both these types of feedback with no feedback. His 
feedback tests and post-test involved similar data interpretation exercises, the feed- 
back test being multiple choice and the post-test used a free-answer format (the 
student wrote in his answers). Hanna found both types of feedback enhanced per- 
formance on the post-test but there was no overall difference between the two types 
of feedback. Both these studies indicate that learning takes place during a test if 
immediate feedback is provided and further show that the form of the feedback is 
not important. 

An advantage of instructing the subject to use total feedback by searching for
the correct answer is that it provides a range of scores. For example, a four-option
multiple-choice item may be scored 3-2-1-0 if answered correctly on the first,
second, third or fourth attempt respectively. The scores derived from this procedure
have been termed Answer-Until-Correct (AUC) scores. However, other methods 
can be used to score the data from tests using feedback and comparisons of these 
form the basis of the second group of studies. 


Several comparisons have been made between AUC scores and results scored as 
one mark if answered correctly on the first attempt, and no marks if more than one 
attempt was taken; this often is called the inferred number right (INR) score. This, 
therefore, involves a comparison of two scores obtained from one attempt at the test. 
Gilman and Ferry (1972) compared AUC scores with INR scores and showed that 
the AUC scores were substantially more reliable. Hanna (1974) found a higher 
internal consistency with AUC scores and also an increase in the validity of the results 
when AUC scoring was adopted. The gains in reliability associated with AUC 
scoring as compared with INR have been replicated by Hanna (1975) and Taylor et al. 
(1975). However, Hanna failed to replicate the increase in validity. 
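The two scoring rules described above can be set side by side in a short sketch. The function names and the list of attempts are hypothetical; the scoring itself follows the text: AUC gives 3, 2, 1 or 0 marks on a four-option item according to the attempt on which the correct answer was found, while INR gives one mark only when the first attempt was correct.

```python
def auc_item_score(attempts, n_options=4):
    """Answer-Until-Correct score: n_options - 1 marks for a first-attempt
    success, falling by one mark for each further attempt needed."""
    if not 1 <= attempts <= n_options:
        raise ValueError("attempts must lie between 1 and n_options")
    return n_options - attempts


def inr_item_score(attempts):
    """Inferred number right: one mark only if correct on the first attempt."""
    return 1 if attempts == 1 else 0


# Hypothetical record for one testee: attempts needed on each of five items.
attempts_per_item = [1, 2, 1, 4, 3]
auc_total = sum(auc_item_score(a) for a in attempts_per_item)
inr_total = sum(inr_item_score(a) for a in attempts_per_item)
print(auc_total, inr_total)
```

Both totals are derived from the same single administration, which is why the comparisons reported above involve two scores obtained from one attempt at the test.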


None of these four studies compared AUC scores with actual number right scores. 
The actual number right, in this case, refers to the same test taken without feedback, 
that is ‘conventional’ test administration. Evans and Misfeldt (1974), however, did
use actual number right scores to compare split-half reliabilities for two groups of 
26 students who took the same test under AUC or conventional directions. The 
reliability for the AUC scores was greater than for the conventionally instructed 
group. Hanna (1977), using somewhat larger numbers, demonstrated a highly 
significant increase in reliability with AUC directions and scoring as compared with a 
no feedback condition. The reliability was also greater with partial feedback but not 
significantly so. The validity coefficients of the three methods were, however, 
virtually identical. Finally Hanna and Long (1979) again found an increment in 
reliability when the same test was administered using AUC directions and scoring 
compared with conventional procedures.

These studies may be summarised as showing that AUC procedures increase
reliability of attainment tests as compared both to the same tests given with feedback
but scored conventionally (INR) and to the same tests administered without feedback. A
similar increase in validity has not been demonstrated. The studies further seem to show that
feedback during a test manifests itself in an increase in post-test scores but there is no
evidence of learning affecting scores during the feedback test itself.


The first purpose of the present investigation was to examine the use of the 
Answer-Until-Correct technique using an established ability test. Previous demon- 
strations of increased reliability have always been with attainment tests, usually 
constructed especially for the purpose and therefore with unknown item character- 
istics. The ability test used was known to be well constructed and any increases in 
reliability due to AUC would have to be substantial to reveal themselves. The 
previous studies have shown both increases and decreases in validity measures and so 
in this study two external measures were sought. The first measure was a mathematics 
score obtained in a school examination which would enable an estimate of the 
concurrent validity of the non-verbal test to be obtained in both its AUC and conventional 
forms. The second measure, a free answer verbal reasoning test, was chosen because 
it would help provide more detailed information about the effects of feedback on the 
relationship between different abilities. As such it was hoped that the measures would 
clarify some of the difficulties associated with the effects of AUC on validity. 


It has also been suggested that an advantage of feedback during an ability test 
as opposed to an attainment test may be that it is possible to derive measures of 
‘ability to learn’ or ‘adaptability’. These could possibly be inferred from changes 
in the pattern of responding during the test. Henning (1975), for example, has 
derived various measures of ability to learn for a test designed to be culture free, the 
Learning Ability Profile. However, she provides very little supportive evidence for 
her various scales and no evidence that learning takes place during her test. The 
examination of the effects of feedback on learning has been restricted to observing 
the increases in score upon readministration of a similar test (e.g., Hanna, 1975). 
As has been noted, O'Neill et al. (1976) failed to find an increment in performance on 
an ability test administered with feedback but there was a rise in scores when the same 
test was immediately readministered. This suggests that answers to specific questions 
were being learned to improve scores on the post-test, a feat of memory rather than 
the discovery of a method of solving later questions in the test. However, the test 
used probably consisted of many different item types and would not lend itself very 
well to learning during the course of its administration. 

The ability test selected for the present investigation has many items of the same 
general type and should be amenable to learning within the course of the test, if 
feedback is provided. The extent to which feedback causes changes in subjects’ 
responses within an ability test was therefore to be examined, since any learning which 
occurs would be due to the testee adjusting his strategies for dealing with subsequent 
items. This can be investigated by means of item analyses performed separately on 
the feedback and non-feedback groups, in addition to a direct comparison of those 
groups. If ‘ability to learn’ and other measures are to be demonstrated then the 
provision of feedback should show a measurable effect on the item characteristics. 
If this is not so then any derived measures are based on the false assumption that 
feedback causes changes in the subjects’ pattern of responding during the course of 
the test. 

METHOD 
Subjects 

The subjects were 227 first year secondary school children aged between 11 years 
10 months and 12 years 9 months. They were attending one of two comprehensive 
schools both in an urban setting. The children were tested in complete form groups, 
the forms having been randomly assigned to each treatment condition. Both schools 
operated a policy of mixed ability form groups. Each experimental group consisted 
of two forms from each school, and comprised 110 children in the feedback condition 
and 117 in the non-feedback condition. 


Materials 

A short 21-item verbal ability test in completion format was produced to be 
administered to both groups. The items were taken from an established verbal test 
and covered a wide range of difficulties. 


The non-verbal test was the NFER DH test, a multiple-choice matrices-style 
test (Calvert, 1970). Owing to shortage of time the abbreviated 64-item version was 
used. This was administered under two conditions. 


(1) The conventional administration with instructions to select the best answer 
and fill in its letter on the separate answer sheet. This involves no feedback 
whatsoever. 


(2) The Answer-Until-Correct directions. The instructions were to select the 
best answer and, using a special marking pen, fill in the square under its 
letter on the answer sheet. A latent image would become visible and inform 
the child if the answer was correct. If the answer was not right he was 
instructed to try again. The instructions emphasised not to guess but to 
reread the question and think about it before making a second attempt. 
The children were told to continue till they found the correct answer but to 
fill in as few squares as possible. 


Procedure 

The two tests were administered in one sitting. Each group of children first 
took the completion verbal test. This took 10 minutes. They then took the non- 
verbal test under the appropriate conditions. Both tests were preceded by examples 
of the types of item to be encountered which were explained in full. The time allowed 
for the 64 items was 40 minutes. 


Only one school agreed to provide a set of examination results for mathematics. 
These were from a recently completed school examination. 


RESULTS 


The initial comparison of reliability was between the test administered normally 
and the INR scores of the feedback group. Reliability was calculated using the coeffi- 
cient alpha. This is a measure of the internal consistency of a test, and represents the 
expected correlation of one test with an alternative form containing the same number 
of items. It is also a reflection of the inter-correlation of the items in the test: when 
the items correlate together well the reliability is high. The square root of coefficient 
alpha is the theoretical correlation between the test scores and the true scores. 
For dichotomous data it is equivalent to the KR20 formula. Coefficient alpha is 
calculated for a K item test as: 


$$r_{kk} = \frac{K}{K-1}\left(1 - \frac{\sum \sigma_i^2}{\sigma_t^2}\right)$$


where $\sum \sigma_i^2$ is the sum of the K item variances 
and $\sigma_t^2$ is the variance of the test scores. 
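
As a minimal sketch of this calculation, the function below computes coefficient alpha from a subjects-by-items score matrix. The function name and the small response matrix are hypothetical, and population variances are assumed throughout (any consistent choice of variance estimator gives the same ratio).

```python
from statistics import pvariance


def coefficient_alpha(scores):
    """Coefficient alpha for scores given as one row per subject, one column per item."""
    k = len(scores[0])
    item_variances = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses of four subjects to three dichotomous items.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(round(coefficient_alpha(responses), 3))  # 0.75
```

The same function applies unchanged to AUC scores, where each item can take more than two values.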


For the conventional administration group the reliability was 0.949 and for the 
feedback group's INR scores 0.942. There is no evidence here that the use of feedback 
increases reliability. The coefficient alpha for the AUC scores, however, was 0.982. 
The increase is not great in magnitude but is limited by the initial high reliability of 
the test. It is, nevertheless, highly significant (P < 0.001) and using Cronbach's (1970) 
signal-noise ratio procedure can be shown to be equivalent to lengthening the test by 
two and a half times. 


Two measures were obtained to provide comparison of the validity of the feed- 
back and no feedback conditions. The first was the verbal reasoning test completed 
by all subjects and the second was the mathematics examination results which were 
obtained from one school only. The results are summarised in Table 1. They show 
that the effects of feedback and AUC scoring lead to no great changes in the correlation 
with mathematics scores (the difference is not significant). 


However, the correlations with the verbal test scores show a decline first with the 
use of feedback and a further decline when AUC scoring is adopted. The difference 
between the conventional administration condition and the AUC score condition is 
significant at the 0.05 level. 
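
The significance test used for these comparisons is not named in the text. One common choice for two correlations from independent groups is Fisher's z transformation, and the sketch below applies it, purely as an assumption, to the Table 1 values for conventional scoring and AUC scoring. The function name is hypothetical.

```python
import math


def fisher_z_test(r1, n1, r2, n2):
    """Two-tailed test of the difference between two independent correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)
    standard_error = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (z1 - z2) / standard_error
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed normal probability
    return z, p


# Verbal test: conventional scoring (n = 117) against AUC scoring (n = 110).
z_verbal, p_verbal = fisher_z_test(0.75, 117, 0.56, 110)
# Mathematics: conventional scoring (n = 57) against AUC scoring (n = 43).
z_maths, p_maths = fisher_z_test(0.72, 57, 0.69, 43)
print(f"verbal: z = {z_verbal:.2f}, p = {p_verbal:.3f}")
print(f"maths:  z = {z_maths:.2f}, p = {p_maths:.3f}")
```

Under this assumed test the verbal difference falls below the 0.05 level and the mathematics difference does not, which agrees with the pattern reported above. The AUC and INR correlations come from the same subjects and would need a test for dependent correlations instead.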


The approach to the measurement of learning poses some difficulties and will be 
considered in two ways, firstly by a comparison of scores on the test and secondly by 
an examination of item difficulties derived from the item analyses. 


The INR scores for the feedback group were in fact lower than the scores for the 
conventional administration condition. The mean for the feedback group was 36.9 
and for the conventional administration condition was 41.9. This difference is largely 
accounted for by the longer time taken to complete the questions using Answer-Until- 
Correct procedures. Thus only 39 per cent of the feedback group completed all the 
items compared to 68 per cent for the conventional group. In order to compare the 
two groups, the subjects’ scores were recalculated from the first 40 items only. The 
percentages attempting all 40 items were similar with 80 per cent for the feedback 
group and 86 per cent for the conventional group. The mean INR score for the 
feedback group was 26.7 and for the conventional group 28.6. The difference is not 
significant and is in the wrong direction for a demonstration of learning from feedback. 


To confirm this apparent lack of learning it is necessary to show that the two 
groups were of similar ability and, if not, to allow for this. An analysis of covariance 
was therefore performed comparing the performance of the two groups for the first 
and second 20 items of the test. In order to obtain equal groups, subjects selected at 
random were discarded until the feedback and conventional groups each consisted of 
80 subjects, 40 from each school. 


The means and standard deviations of the groups are shown in Table 2. The 
decline in score for the two groups is similar and there is no evidence that learning is 
taking place in the feedback group. This was confirmed by an analysis of covariance 
which used the scores on the first 20 items to equate the groups and compared the 
scores on the second 20 items. The difference was not significant, F (1, 157) = 0.588 
(NS). 


TABLE 1 


CORRELATIONS BETWEEN EXTERNAL TESTS AND SCORES FROM DIFFERENT 
SCORING PROCEDURES 


                                   Feedback group            No feedback group 
                              AUC            INR             Conventional 
                              scoring        scoring         scoring 
Verbal test score             0.56 (110)     0.68 (110)      0.75 (117) 
Mathematics examination 
  result                      0.69 (43)      0.69 (43)       0.72 (57) 


Numbers in brackets give sample sizes. 


TABLE 2 


MEANS AND STANDARD DEVIATIONS OF SELECTED GROUPS FOR 
FIRST AND SECOND TWENTY ITEMS IN TEST 


                              First 20 items     Second 20 items 
Feedback group                X̄ = 13.9           X̄ = 12.3 
INR scoring                   SD = 4.2           SD = 4.8 
(N = 80) 

No feedback group             X̄ = 14.7           X̄ = 13.4 
Conventional scoring          SD = 4.1           SD = 4.8 
(N = 80) 


67 subjects were discarded to give equal groups. 

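
The analysis of covariance reported above can be sketched with the classical one-covariate formulas, in which the outcome sums of squares are adjusted for the regression on the covariate. This is an illustrative sketch and not the authors' computation: the function name and the data are hypothetical, with the covariate standing for the score on the first 20 items and the outcome for the score on the second 20 items.

```python
def ancova_one_covariate(groups):
    """One-way ANCOVA with a single covariate.

    groups: list of groups, each a list of (covariate, outcome) pairs.
    Returns (F, df_between, df_error) for the covariate-adjusted group effect.
    """
    def sums(pairs):
        n = len(pairs)
        mean_x = sum(x for x, _ in pairs) / n
        mean_y = sum(y for _, y in pairs) / n
        ss_x = sum((x - mean_x) ** 2 for x, _ in pairs)
        ss_y = sum((y - mean_y) ** 2 for _, y in pairs)
        sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
        return ss_x, ss_y, sp_xy

    pooled = [pair for group in groups for pair in group]
    ss_x_total, ss_y_total, sp_total = sums(pooled)
    within = [sums(group) for group in groups]
    ss_x_within = sum(w[0] for w in within)
    ss_y_within = sum(w[1] for w in within)
    sp_within = sum(w[2] for w in within)

    # Outcome sums of squares after removing the regression on the covariate.
    adjusted_total = ss_y_total - sp_total ** 2 / ss_x_total
    adjusted_within = ss_y_within - sp_within ** 2 / ss_x_within

    df_between = len(groups) - 1
    df_error = len(pooled) - len(groups) - 1
    f_ratio = ((adjusted_total - adjusted_within) / df_between) / (adjusted_within / df_error)
    return f_ratio, df_between, df_error


# Hypothetical (first-20, second-20) scores for two very small groups.
feedback = [(14, 12), (10, 9), (17, 15), (12, 11)]
conventional = [(15, 14), (11, 9), (18, 16), (13, 12)]
print(ancova_one_covariate([feedback, conventional]))
```

With two groups of 80 subjects the error degrees of freedom are 160 - 2 - 1 = 157, which matches the F (1, 157) reported.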

The second approach used item analysis procedures to compare the difficulties 
associated with each item from the two samples. If learning is taking place with feed- 
back the later items should be easier in the INR analysis than in the analysis of the 
conventional results. Once again to eliminate the effect of items not reached the item 
analysis was performed on only the first 40 items. 


Figure 1 shows the facility value (proportion of subjects answering item correctly) 
for conventional and INR scoring for the initial and final ten items. If learning were 
taking place the INR facilities should be higher for the final ten items. This is not so. 
It may be that the slight ability difference between the groups is contaminating the 
facility values and masking a learning effect. 


In order to eliminate the difference between the groups an item analysis based on 
the Rasch model was performed (Wright and Stone, 1979). These procedures are 
claimed to give sample-free difficulty scalings (Tinsley and Dawis, 1975) which are 
derived from a logarithmic model of the individual's performance on individual items. 


The facility value for each item is converted to a difficulty value by taking the 
natural log of the proportion incorrect divided by the proportion correct. This 
converts the proportions to a new linear scale, which increases with the proportion 
of incorrect responses. The item difficulties are then adjusted so that their mean 
difficulty for the test is zero. In order to make the item difficulties sample-free, they 
are multiplied by an expansion factor derived from the spread of abilities in the sample. 
This is necessary because the more dispersed the abilities, the more similar in difficulty 
the items will appear. The expansion factor removes this anomaly. The result of 
these calculations is a logit value for the difficulty of each item which is sample free. 
It is therefore possible to compare the same items from the test taken with and without 
feedback. The logit difficulties for the first and final ten items are shown in Figure 2. 
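
The first two steps of this conversion can be sketched as follows. The function name and the facility values are hypothetical. The expansion factor is taken here as a number supplied by the caller, because its derivation from the spread of abilities is not given in the text and is not reconstructed.

```python
import math


def logit_difficulties(facilities, expansion_factor=1.0):
    """Convert facility values (proportions correct) to centred logit difficulties.

    expansion_factor is supplied by the caller; its derivation from the
    spread of abilities in the sample is not reproduced here.
    """
    # Natural log of proportion incorrect over proportion correct.
    logits = [math.log((1 - p) / p) for p in facilities]
    # Centre the difficulties so that their mean for the test is zero.
    mean_logit = sum(logits) / len(logits)
    return [expansion_factor * (d - mean_logit) for d in logits]


# Hypothetical facility values for three items, easiest first.
print([round(d, 2) for d in logit_difficulties([0.8, 0.5, 0.2])])
```

Easier items receive negative values and harder items positive values, so a learning effect would appear as lower difficulties for the later items under feedback.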


If learning is taking place the final ten items would show lower difficulty values 
for the INR scores. Since the difficulty values are sample-free the slight ability 
difference between the groups is irrelevant. Once again there was no evidence that 
any learning was taking place in the feedback group. In fact, when the difficulties 
throughout the two item analyses were compared no difference for any item was 
found to be significant. 


It seems that despite the similar nature of the items and the provision of feedback 
during the test, subjects do not or cannot take advantage of this to enhance their 
performance on the later items in the test. 


DISCUSSION 


The enhanced reliability from the use of AUC scoring found by previous investi- 
gators (e.g., Hanna, 1977) was replicated. However, the generality of the finding has 
been increased since for the first time it has been shown for an ability test rather than 
a specially constructed attainment test. Moreover, the increment in reliability takes 
place even though the initial reliability of the test used was much higher than the 
tests used previously (e.g., Hanna and Long, 1979, showed an increase from 0.53 to 
0.64). 


The increase in reliability associated with AUC scoring means that it would be 
possible to shorten a test, retaining a high reliability, but saving time. This would 
have to be balanced by the increase in time per item which using AUC procedures 
causes. In the present case the gain in reliability is equivalent to a test two and a half 
times as long, but the subjects were completing items at about 60 per cent of the normal 
rate. This indicates that some savings in time can be made through using a shorter 
AUC test, despite the longer time per item. 
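
The arithmetic behind this claim can be made explicit. The sketch assumes that an AUC test needs only 1/2.5 of the items to match the reliability of the conventional test and that AUC items are completed at 60 per cent of the normal rate. The function name is hypothetical.

```python
def relative_testing_time(lengthening_factor, relative_item_rate):
    """Time for a shortened AUC test as a fraction of the conventional testing time."""
    relative_length = 1 / lengthening_factor      # fraction of the items needed
    return relative_length / relative_item_rate   # slower rate means more time per item


print(round(relative_testing_time(2.5, 0.6), 2))  # 0.67
```

On these figures a shortened AUC test would take about two-thirds of the conventional testing time, a saving of roughly one-third.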


The increment in reliability depends on the adoption of AUC scoring, not 
simply the presentation of feedback. Hanna (1974) suggested that the gain in 
reliability may have arisen from immediate feedback interacting with affective traits 
and, for example, adversely affecting the performance of anxious examinees. However, 
the INR scores, which were obtained under the same feedback, showed no such gain 
in reliability. It is difficult to understand why AUC scores should 
be more influenced by anxiety than INR scores if the anxiety is aroused by the feed- 
back which is present in both cases. 


The enlargement of the score range per item means that the sum of the variances 
of all the items is increased and also that the variance of the test scores is increased. 
Reference to the formula for coefficient alpha will show that in order to increase 
reliability the variance of the test scores must increase proportionally more than the 
sum of the item variances. For this to be true the second and subsequent choices 
must be more than random guessing as this would simply increase the item variances 
in the same proportion as the total test variance. The increase in reliability with 
AUC scoring therefore suggests that the opportunity for second and subsequent 
choices is reflecting partial knowledge and not simply random guessing. 
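
The algebraic point about proportional increases can be checked directly from the formula for coefficient alpha. The sketch below uses hypothetical variances for a 64-item test; the function name is also hypothetical.

```python
def alpha_from_variances(k, sum_item_variances, total_variance):
    """Coefficient alpha from the sum of item variances and the test score variance."""
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


# Hypothetical variances for a 64-item test.
baseline = alpha_from_variances(64, 12.0, 150.0)
# Both variances doubled: the ratio, and therefore alpha, is unchanged.
proportional = alpha_from_variances(64, 24.0, 300.0)
# Test variance tripled while item variances only double: alpha rises.
disproportional = alpha_from_variances(64, 24.0, 450.0)
print(round(baseline, 3), round(proportional, 3), round(disproportional, 3))
```

Only the third case raises alpha, which is the pattern the argument above attributes to second and subsequent choices that reflect partial knowledge.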


To summarise, AUC scoring would seem to be an efficient means of enhancing 
the reliability of a test: it widens the range of test scores so that partial 
knowledge is assessed during the test. 


The effect of AUC scoring on correlations with other measures is much more 
confused. Previously it has been shown to increase validity (Evans and Misfeldt, 
1974; Hanna, 1974), to decrease validity (Hanna, 1975) and to make no difference 
(Hanna, 1977). In the present case, correlations with a mathematics examination 
were unchanged but the correlations with a verbal intelligence test were considerably 
reduced. 


The decline in correlation with the verbal test can be seen as a positive gain from 
the adoption of AUC since it appears to have reduced the verbal loading of what 
purports to be a non-verbal test, while at the same time maintaining the correlation 
with an external measure, the mathematics test. 


This illustrates the care which must be taken in attempting to understand the 
effects of AUC on validity. The validity of the test may be enhanced by using AUC 
but only when the criterion is relevant to the adoption of AUC. This might only 
be when the criterion itself requires some adjustment on the basis of feedback, as 
suggested by Hanna (1975). However, in most cases, if it can be demonstrated that 
the adoption of AUC does not detract from a relevant validity criterion the enhanced 
reliability can be sufficient justification for its use. 


In one area the present study was totally negative: despite the careful choice of 
an ability test which allowed ample scope for learning, no learning of methods of 
solving items appears to have occurred in the feedback group. This confirms the 
impression given by the O’Neill et al. study (1976). 


There are two types of possible reasons for the failure of subjects to learn during 
the test despite the provision of feedback. Firstly, it could be that the instructions did 
not sufficiently emphasise this aspect of the use of feedback. The emphasis was 
placed on re-examining questions where the wrong answer had been given, and this 
should have provided some information for later items of the same type. However, 
children of this age may not learn from experience unless overtly directed to do so. 
Despite the fact that the items are similar in nature and form, they may in fact be 
regarded by the children as completely separate entities. To examine this it would 
be necessary to be much more directive to the subjects in a feedback group, giving 
instructions on how to use that feedback. 


A second type of explanation may arise from the test itself. Perhaps the test did 
not, in fact, allow learning of concepts for answering questions, despite the similarity 
of the items. In addition, the high facility values of the items (for both groups) may 
mean that the level of learning is initially high and did not allow very much scope for 
improvement. To examine these it would be necessary to specially construct a test in 
which the items had low facilities and a common theme throughout the test. This is 
at present being undertaken. 


This lack of any learning casts doubts on the current movement toward measures 
of learning ability derived from change of performance within an ability test (e.g., 
Henning, 1975). Test constructors who develop such indices of learning should give 
demonstrations that learning is indeed taking place, and this should be done by a 
comparison with a non-feedback group. It is not sufficient to demonstrate a general 
increase in numbers passing later items, as this might be due to the items themselves 
being easier. Again this should be established by reference to the same items taken 
without feedback. 


To conclude, the study has demonstrated that the increase in reliability associated 
with AUC scores may be generalisable to ability measures, and is not confined to 
attainment measures alone. Further, the assumption that feedback allows learning 
to take place during a test and affects later items within that test was not supported, 
and it is suggested that tests which purport to give derived measures of the ability to 
learn should give clear demonstrations that some learning within the test is taking 
place. This should preferably be done by comparison with the same test taken 
without feedback. 
PUPIL PERCEPTION AND PUPIL EVALUATION BY 
SENIOR SECONDARY TEACHERS IN RELIGIOUS 
AND COMPREHENSIVE SCHOOLS IN THE REPUBLIC 
OF IRELAND 


By RACHAEL M. HENRY 


(Department of Applied Psychology, University of Wales 
Institute of Science and Technology) 


SUMMARY. The application of Repertory Grid Technique to teachers’ perceptions and 
evaluations of senior secondary pupils in two urban religious schools and a rural 
comprehensive challenged traditional approaches underlying classifications such as 
‘progressive’ versus ‘traditional’ as representations of teacher outlook. It also revealed 
a generally low level of articulation and differentiation in those teachers' construct 
systems in the area of pupil perception. The analysis of individual grids failed to support 
the assumptions of intrapersonal and transpersonal meaning implicit in previous 
interpretations of construct frequency and rank data in this area. Finally, disjunctions 
between the Repertory Grid data and pupils' 16PF scores indicated possible constraints 
operating on the interpretation of Repertory Grid data. 


INTRODUCTION 


AT the end of the 1960s, Musgrove and Taylor (1969) claimed that the role of the 
teacher in modern Britain had proved an inexhaustible subject for arm-chair theorising 
and inaugural lectures but that there had been virtually no relevant empirical studies 
on the subject. The most comprehensive historical analysis of changing attitudes to 
the role of teachers and to the proper discharge of their functions was carried out by 
Beilin (1959). He examines a monograph published by E. K. Wickman (1928), 
‘Children’s Behavior and Teachers’ Attitudes’, which contrasted teachers’ and 
mental hygienists' attitudes towards the behaviour problems of children, and pre- 
cipitated an assault upon the teacher's mode of dealing with children when it made 
evident that teachers' attitudes were widely at variance with those of clinicians. 
Wickman's results suggested that mental hygienists were primarily concerned with 
‘withdrawing’ and other non-social forms of behaviour in children of elementary 
school age, whereas teachers of the same children were more concerned with classroom 
management, authority and sex problems. Teachers were urged to adopt a hierarchy 
of attitudes closer to that of the clinician; the failure to question the appropriateness 
for teachers of the judgmental criteria employed by clinicians is regretted by Beilin. 
His review establishes that in 1927 differences did exist between the attitudes of 
teachers and clinicians towards the behaviour problems of children, and that between 
1927 and the late 1950s there had indeed been a shift in the hierarchy of teachers’ 
attitudes to approximate more closely to those of clinicians. This shift was more 
pronounced in elementary school than in high school, which Beilin sees as consistent 
with the inevitability and desirability of the principally task-oriented role of the 
secondary teacher. 


The evidence in Britain in the early 1960s conforms to Beilin's analysis and 
projection of a continuing influence of clinical psychology in educational spheres, 
broadening beyond the elementary school level. Wilson (1962) argued that the 
teacher's role must become more ‘diffuse’ at a time when most professional roles 
were becoming more specialised and specific. Mays (1962) similarly argued that a 
teacher's role must broaden in scope, embracing ever more ‘parental’ functions and 
calling for the skills and interests of the social worker. Musgrove and Taylor (1965) 
depicted the argument as one ‘between the ideas of the teacher as a pure inculcator 
of knowledge and the teacher as a welfare worker’. This corresponds roughly to 
two educational views described by Dewey 50 years ago as ‘traditional’ and ‘progressive’. 
It has subsequently emerged under various guises in the psychological and 
sociological literature: ‘knowledge-centred teachers’ versus ‘child-centred teachers’ 
(Barker Lunn, 1970), ‘task-oriented teachers’ versus ‘person-oriented teachers’ 
(Brophy and Good, 1974), ‘instrumental’ versus ‘expressive’ functions of school 
(Bernstein, 1966; Davies and Meighan, 1975). 


The question of educational goals is particularly important and acute in Ireland. 
The reform of education in the last 15 years has resulted in there being more people 
in full-time education per member of the work force in Ireland than in any other 
OECD country, a higher proportion of its teenagers in full-time education than in 
any country other than the United States and Sweden and a uniquely high proportion 
of its total government budget devoted to education. However, the degree of reform 
that has taken place is difficult to determine since the Department of Education has 
been markedly reticent in providing the public with records of its activities during 
the past 15 years, to the extent that there is serious speculation among educational 
observers that the Department may have quietly abandoned altogether this vehicle 
of public accountability (e.g., Akenson, 1975). 


The most important innovations in the 1960s came in the state financing of 
schools, the introduction of free post-primary education and the creation of new 
forms of post-primary institutions. The school leaving age was raised to 15 in 
1972-3, and the almost entirely linguistic curriculum of the secondary school markedly 
broadened. However, far from being revolutionary, the new financial arrangements 
reinforced the existing division of power between church and state, the church 
surrendering no powers of any significance in return for massive amounts of money 
(Akenson, 1975). For example, in potentially one of the most far-reaching of the 
government’s ideas for reform, the ‘comprehensive’ school, the church’s powers 
have been preserved in the composition of the school management. The compre- 
hensive school has not come under any new form of national control, and has appeared 
only in areas inadequately served by existing secondary and technical institutions. 


At a very general level, the move towards the reform of secondary education in 
Ireland was quite explicitly concerned with the issue of economic growth (OECD, 
1969). Yet, despite the enormous investment of money that has been made in 
education, there has been even less research into the area of educational goals in 
Ireland than elsewhere. Reporting one of the few studies done on the subject of 
attitudes of secondary school teachers, Raven (1977) says: 


“In practice, if one asks people what the aims of education are, a very wide 
range of objectives are enumerated: development of character; willingness to 
work at boring and useless tasks (‘which they will have to work at all their lives’); 
unwillingness to tolerate unpleasant situations ... Christian morality; ability to 
express oneself; willingness to listen to and understand others; willingness to 
recognise problems; willingness to challenge what appear to be the authorities 
on a subject; critical thinking; good taste; self-confidence; maturity; leadership; 
changes in people’s views of themselves, their self-images; and [...] 
pupils to view as appropriate to themselves certain social roles and vocations.” 


Raven goes on to point out that 


“what is significant about this list is the lack of unanimity between educationalists, 
the varying degree of generality or inclusiveness, of the objectives as they are 
stated, the lack of relationship between these aims and the educational practices 
and procedures (inputs, learning experiences) which take up so much time in 
education and the absence of measures to assess the extent to which attainment of 
these aims is achieved in the examinations used to classify people as educated or 
not, or to compare the adequacy of one school with another and one teacher 
with another. 

“Even less discussed than these positive objectives of education are the 
possible ill effects that education may have, effects which may have consequences 
just as important as attainment of any of the positive aims” (pp. 8-9). 


This range of opinion reported to be held by secondary teachers in the Republic 
in the 1970s contrasts with Beilin's suggestion that it was in the elementary school 
that the ‘progressive’ or ‘child-oriented’ philosophy had had its impact by the end 
of the 1950s, with secondary school teachers remaining largely subject matter oriented. 
It also contrasts with both the ethos of Irish reforms and Akenson's analysis of the 
Irish educational outlook which suggests a particularly anti-progressive perspective 
(at least prior to the 1960s). This mixed picture suggests the hypothesis that a change 
in teachers to a more ‘progressive’ outlook has accompanied the educational reforms 
that have taken place in Ireland over the last 15 years. 


This hypothesis requires qualification, however. A great deal of importance is 
claimed by Raven's teachers for the development of competencies and qualities of 
character such as independence, initiative, a sense of duty towards the community, 
the ability and the desire to read on one's own, an enthusiasm to learn, to take 
pleasure in communicating effectively, consideration, and developing a philosophy of 
life. Raven points out that “none of these things implies mere knowledgeability or 
ability but . . . an interest, keenness, positiveness, and in general, a degree of 
self-initiated, self-motivated activity which is not commonly associated with the classroom”. 
All were regarded as more important than, for example, broadening the pupils’ 
academic education, teaching them about a wide range of culture and philosophies, 
running extra-curricular activities, or teaching scepticism. Raven points to certain 
inconsistencies in these findings, highlighting the problems with questionnaire data 
of the kind relied on by him and suggesting some qualification to the aforementioned 
hypothesis: 


“If it is important for pupils to have opinions of their own then it seems 
reasonable to expect teachers to also think that it is important to encourage 
pupils to be sceptical of statements made by others. Yet this does not appear to 
be the case ... one should be careful not to read too much importance attached 
to developing independent opinions. Likewise, it may seem strange that, in 
view of the priority given to pupils being able to learn on their own, to having 
an open mind, to being open to new ideas, that teaching pupils about a wide 
range of cultures and philosophies does not come higher up in the rank order.” 


Raven goes on to report a wealth of additional data on teachers and pupils that 
substantiate these anomalies. 


The apparent disjunction between what teachers claim as their objectives and the 
central focus of their concern in practice may be partly resolved by a phenomenon 
reported by Craig (1960) of differences between teachers' conception of their function 
in different levels of secondary school in Scotland. ‘Training for citizenship’ is a 
phrase sometimes used by junior secondary teachers and others to describe the work 
of the junior secondary school and, while the term is large enough to be applied to 
almost any aspect of the teacher's ‘job in either the junior or the senior secondary 
school, the specific interpretations put upon it by teachers in the two kinds of school 
reveal interesting differences in their job conceptions. Briefly, most secondary teachers 
at the senior level were preoccupied with the development of the intellectual faculties 
and considered that ‘ training for citizenship ° is inherent in the intellectual disciplines 
through which they guide their pupils; by contrast, the secondary teacher at the 
junior level approached ‘training for citizenship’ in terms of the inculcation of 
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general moral values (apparently with the belief that that undertaking was frequently 
lacking in parents). 

The foregoing account points to the limitations of direct interview/questionnaire 
approaches to the study of teacher attitudes and to the need to go beyond consciously 
construed educational objectives to the underlying individual meanings attached to 
those objectives and the processes which mediate between those objectives and 
actual practice. More oblique approaches to this subject have examined the complex 
of attitudes and perceptions vis-à-vis children of different social background, sex, 
ability level, personality type etc. underpinning those broader educational views. 
The object of the present study was to complement and to attempt to answer some of 
the questions raised by Raven concerning general educational objectives held by 
secondary teachers in the Republic of Ireland, with an exploration of the attributes 
in terms of which teachers actually characterise, differentiate and evaluate their pupils. 
These are examined both in terms of their content and internal structural organisation, 
and in relation to ‘objective’ characteristics of the objects of those constructions 
(i.e. the pupils). The object of this exercise was, first, to address the question of the 
relevance of those areas which have received most attention in education for the last 
quarter of a century to teachers in the Republic of Ireland today. It explores how 
widely shared are particular concerns and objectives, whether they are thought to be 
equally appropriate to all pupils, in particular as between religious and comprehensive 
school pupils, as well as between male and female and urban and rural pupils. The 
second object of the study was to look at the potential of Repertory Grid Technique 
as a tool for exploring the realm of pupil perception. 


METHOD 


The specification of teachers’ perceptions and attitudes 
Psychological investigations of teachers' perceptions/attitudes have generally 
employed one of three different methodologies: 


(a) the a priori postulation of certain psychological elements or dimensions 
which form the basis for a questionnaire (e.g., Stouffer, 1956; Raven, 1977); 

(b) the content analysis of interview material, for example, the analysis of 
rationalisations of judgments of adjustment in pupils (e.g., Beilin, 1958; 
Silberman, 1969); 

(c) the study of teacher reactions of preference or approval to descriptions of 
hypothetical pupils (e.g., Feshbach, 1969). 


The variety of subsequent classifications of teacher perceptions have been derived 
either empirically (on the basis of factor analysis, for example) or, more usually, on 
conceptual grounds. The present study is not restricted to empirically based classifi- 
cations since it is arguable whether one can support conceptual distinctions by factors. 


A. The non-empirical classification of teachers’ perceptions 

While the global categories of ‘traditional’ and ‘progressive’ are the most 
common representations of different kinds of teacher attitudes, there are no very 
clear or consistent links between those global categories and the descriptive elements of 
which they are said by different researchers to be composed. No very clear 
specification has been given of the various attitudes and perceptions that those categories 
subsume. For example, descriptions of the ‘progressive’ teacher have included any 
or all of the following elements: an expressive leader who compromises with the 
pupils to a greater extent; allows some talking while they are working privately; 
encourages pupils’ contributions to a discussion and contrives to work them into the 
topic if they are not altogether relevant; asks more questions than the traditional 
teacher; lectures less and states his/her own opinion less; has a sense of humour and 
encourages an occasional laugh; senses which pupils have problems of various kinds 
and varies accordingly the way they are handled; tends to be fond of pupils; has 
extensive knowledge of pupils’ private interests, home backgrounds, ambitions, 
relationships to peers and other teachers. For simplicity it is frequently assumed that 
all tend to go together. Brophy and Good's (1974) description of ‘person-oriented’ 
teachers catches up the theme of a favourable attitude and close relationship with 
students, but also the belief that ‘helping them develop positive self concepts is an 
important part of their job’. The development of the person's moral capacities is 
another feature which progressivism is said to emphasise (e.g., Barker Lunn, 1970). 


In view of the multifarious nature of global categories such as this, the non- 
empirical classification system adopted in the present study was established at a less 
global level. Even at this level, the variety of category definition (both in terms of 
labels and items subsumed) makes their summary difficult, and none could claim 
categories that were mutually exclusive. Broadly speaking, the most common 
division of pupil perception defines four areas: Personality/emotional (e.g., extraverted, 
self esteem); Social/interpersonal (e.g., helpful to others, aggressive); 
Character (e.g., independence, leadership); Task-oriented (spanning the areas of 
ability and competence as well as motivation). The ‘traditional’ teacher is said to 
focus on task-oriented factors and ‘conservative’ attributes of character, while the 
‘progressive’ teacher is concerned more with pupils’ personal, social and general 
character development. 


In the interests of integrating the present research with earlier work, those broad 
categories have been preserved in the initial stages of the analysis (with the exception 
of the social and character categories, which were combined). However, the presenta- 
tion and discussion of the results will extend to the individual items subsumed, in 
view of the clear overlap between, and the heterogeneity of items within the summary 
categories. 


B. The empirical classification of teachers’ perceptions 

Another false impression created by the ‘traditional’ versus ‘progressive’ 
characterisation is the notion that attitudes towards education form unidimensional 
systems. This notion presumably accounts for the fact that studies of teacher 
orientation have generally concentrated on one or two predetermined attitudinal 
dimensions. Studies of the dimensionality of the teacher attitude domain as a 
problem in its own right have demonstrated that ‘progressivism’ and ‘traditionalism’ 
are relatively independent dimensions, so that teacher attitude change may occur 
along one dimension and not another, or it may occur along both dimensions 
(Kerlinger and Kaya, 1959; Tuel and Shaw, 1966; Wolfe and Engel, 1978). 

Since any non-empirical classification system is liable to be in some way idio- 
syncratic to the research worker, and since a core aim of the present study is to go 
beyond teachers' expressed attitudes to their meaning for the individual concerned, 
the present study will also examine empirically the dimensions of perception of 
individual teachers and their interrelations. Repertory Grid Technique has recently 
been applied to the study of the process of ‘typification’ and pupil perception 
(Nash, 1973; Thompson, 1975), because it elicits constructs from people rather than 
imposing constructs on them (the more usual procedure in attitude and personality 
tests) without sacrificing the advantages of objectivity and quantifiability. It is used 
here to elicit from teachers the constructs which they use to characterise their pupils, 
and to demonstrate the individual meanings of those constructs through their topo- 
graphical organisation. 


Sample 
The study was carried out in the fourth year of three secondary schools in County 
Cork in the Republic of Ireland. The fourth year is the first in the two-year senior 
cycle curriculum, generally completed at the age of 17 or 18. The fourth year pupils 
in the present study ranged from 15 to 18 years. Two of the schools were in the City 
of Cork: a boys’ and a girls’ voluntary secondary school, both long established and 
administered by the same religious order and catering for working class/lower middle 
class pupils. The third school was one of the new comprehensives, very modern in 
design and relaxed in atmosphere, having recently been established in a remote rural 
area where adequate facilities for second-level education had previously been un- 
available. All the pupils at this school were either from farming families or local 
townspeople. 

The teachers taking part in the study all taught in the fourth year, covered a range 
of specialisations, and were all within the young to middle aged bracket: two male 
teachers in the boys’ school, two female and one male teacher in the girls’ school, and 
two male and one female teacher (including a male careers teacher) from the 
comprehensive. 


Procedure 

In each school, the teachers who volunteered to take part drew up a list of all 
those fourth year students who were being taught by a minimum of two of them. 
These comprised the elements of the grid. In the two schools where three teachers 
were involved, the pupil population comprising the elements is not identical for all 
three teachers; however, in both cases there was a broad overlap. The number of 
elements ranged between 16 and 30. The grids were administered individually to the 
teachers, using the method of triads. Two constructs which were of relevance to a 
further study with the pupils were supplied if they did not emerge in the elicitation 
process; nine or ten triads were presented, the number varying according to whether 
the constructs to be supplied at the end emerged in the process of elicitation. The 
teachers were then asked if there were any characteristics which differentiated their 
pupils which had not so far been elicited. Only one teacher supplied an additional 
construct and this was added to her construct list. The teachers then rated all the 
pupils comprising their list of elements in terms of each of the constructs, using a 
7-point scale. The two supplied constructs were: 


Helpful, considerate towards other pupils 
Helpful, co-operative with the teacher 
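
As an illustrative sketch only (it is not part of the study's procedure), a completed grid of this kind can be represented as one list of 7-point ratings per construct, with one rating per pupil. The number of pupils and the ratings below are hypothetical; only the two supplied construct labels come from the text.

```python
# Hypothetical grid: each construct maps to one 7-point rating per pupil (element).
SCALE_MIN, SCALE_MAX = 1, 7


def validate_grid(grid, n_elements):
    """Check that every construct rates every element on the 7-point scale."""
    for construct, ratings in grid.items():
        if len(ratings) != n_elements:
            raise ValueError(
                f"{construct!r}: expected {n_elements} ratings, got {len(ratings)}"
            )
        for rating in ratings:
            if not (isinstance(rating, int) and SCALE_MIN <= rating <= SCALE_MAX):
                raise ValueError(f"{construct!r}: rating {rating!r} is outside 1-7")
    return True


# Six hypothetical pupils rated on the two supplied constructs.
grid = {
    "Helpful, considerate towards other pupils": [7, 5, 6, 2, 3, 1],
    "Helpful, co-operative with the teacher": [6, 5, 7, 1, 3, 2],
}
grid_is_valid = validate_grid(grid, n_elements=6)
```

Elicited constructs would simply be added as further keys of the same dictionary, one per triad.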


The second part of the exercise involved the teacher in rank-ordering the 
constructs (elicited and supplied) according to their importance regarding the sorts 
of qualities which they personally believed to be the function of schools to promote. 

Finally, the ‘objective’ measure of personality employed, the 16PF, was 
administered to all the pupils involved. 


RESULTS AND DISCUSSION 


Content of teachers’ ratings of pupils 

Table 1 summarises the characterisation of pupils by teachers from different 
types of secondary school in one county in Ireland in terms of the classification 
system adopted. In view of his specialised function, the careers teacher's results are 
presented separately. It is notable that the range of constructs elicited from teachers 
in this manner is far narrower than that which the material supplied in studies such 
as Raven's would suggest is pertinent. 


Differences were found between teachers in the three different kinds of school. 
Regarding the global categories the boys’ teachers saw their pupils primarily in 
task-oriented terms (i.e., categories C and D); the teachers at the other two schools 
categorised their pupils primarily in terms of attributes of social character and 
personality. 

TABLE 1 

PERCENTAGE OF CONSTRUCTS ELICITED FOR EACH CATEGORY, AVERAGED ACROSS TEACHERS (% CAT) AND ACROSS 
SCHOOLS AND PERCENTAGE OF TEACHERS WITHIN SCHOOLS SUPPLYING CONSTRUCTS IN EACH CATEGORY (% TS) 

                        Girls’            Boys’             Comprehensive school       Average
                        religious school  religious school  Class             Careers  across
                                                            teachers          teacher  schools
Category                % CAT    % TS     % CAT    % TS     % CAT    % TS     % CAT    % CAT

Personality
A1 introvn/extravn      26       100      15       100      11       50       22       18.5
A2 self esteem          7        66       5        50       16       100      22       12.5
Total                   33                20                27                44       31

Social Character
B1 independence         10       66       0        0        0        0        0        2.5
B2 openness             7        66       0        0        5        50       0        3
B3 risk-taking          3        33       0        0        0        0        0        0.8
B4 leadership           0        0        0        0        5        50       11       4
B5 dependable           7        33       0        0        0        0        0        1.8
B6 respect authority    10       66       15       100      16       100      11       13
B7 respect peers        3        33       15       50       11       50       11       10
Total                   40                30                37                33       35.1

Competence/Ability
C1 intelligence         3        33       10       100      0        0        11       6
C2 likely success       0        0        5        50       5        50       0        2.5
Total                   3                 15                5                 11       8.5

Motivational
D1 interest             17       100      20       100      16       100      11       16
D2 ambition             3        33       15       100      0        0        0        4.5
Total                   20                35                16                11       20.5

Miscellaneous
E1 athletic             3        33       0        0        5        50       0        2
E2 background           0        0        0        0        5        50       0        1.3
E3 interest in opp. sex 0        0        0        0        5        50       0        1.3
Total                   3                 0                 15                0        4.6
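
The final column of Table 1 appears to be the unweighted mean of the four % CAT columns (girls' teachers, boys' teachers, comprehensive class teachers, careers teacher). That reading is an inference from the figures, not something the text states. A minimal check on three rows, using the values printed in the table:

```python
# % CAT values per row: (girls' school, boys' school, comprehensive class
# teachers, careers teacher), as printed in Table 1.
pct_cat = {
    "A1 introvn/extravn": (26, 15, 11, 22),
    "A2 self esteem": (7, 5, 16, 22),
    "B6 respect authority": (10, 15, 16, 11),
}

# Assumed rule: the last column is the plain mean of the four % CAT columns.
averages = {name: sum(values) / len(values) for name, values in pct_cat.items()}

for name, value in averages.items():
    print(f"{name}: {value}")
```

On this assumption the three rows give 18.5, 12.5 and 13, matching the printed column.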


The pattern of construct distribution within the global categories reveals further 
discrepancies between schools. The girls’ and comprehensive schools shared a focus 
on intra-personal factors; however, the girls’ and boys’ teachers were more attuned 
to introversion/extraversion than to self-esteem, while the opposite trend emerged 
among the comprehensive teachers. Regarding the distribution of constructs within 
the social character classification, a much wider range of qualities of social character 
came from the girls’ school than from the other two schools, where assessments were 
distributed predominantly in terms of pupils’ social relations with authorities and 
peers. Of the task-oriented constructs, the largest discrepancies between schools 
concerned the intelligence and ambition categories; both were employed by both 
boys’ teachers, by only one of the three girls’ teachers and by neither comprehensive 
teacher (although ability was supplied by the careers teacher). That is, the inclination 
to classify pupils in terms of interest and motivation rather than in terms of ability 
and competence was more marked in the girls’ and comprehensive schools than in 
the boys’ school. 


Hence, the usual ascription of an increasing preoccupation with task-oriented 
and character factors to senior secondary teachers, and a confinement of so-called 
‘progressive’ or ‘child-centred’ concerns of personal/emotional development to the 
lower levels, may obscure interesting modifications in relation to particular environ- 
ments. 


Rank ordered evaluations of constructs as educational objectives 

The overall rankings of constructs in terms of their perceived educational 
importance were calculated for each school by summing across teachers’ five most 
highly ranked constructs, individually weighted 1 to 5 according to order of importance. 
The weighted ranks for individual teachers and the summed ranks for each school are 
presented in Table 2. 
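
The weighting described above can be sketched as follows. This is only an illustration of the arithmetic: the teacher labels and rankings are hypothetical, and the sketch assumes that each teacher's most important construct receives the weight 5 (consistent with the footnote to Table 2) and that constructs outside a teacher's top five contribute nothing.

```python
from collections import defaultdict


def weighted_top_five(ranking):
    """Weight a teacher's five top-ranked constructs 5, 4, 3, 2, 1.

    `ranking` lists constructs from most to least important.
    """
    return {construct: 5 - position for position, construct in enumerate(ranking[:5])}


def school_totals(rankings_by_teacher):
    """Sum the weighted ranks across all teachers in one school."""
    totals = defaultdict(int)
    for ranking in rankings_by_teacher.values():
        for construct, weight in weighted_top_five(ranking).items():
            totals[construct] += weight
    return dict(totals)


# Hypothetical rankings for two teachers in one school.
example_school = {
    "teacher_1": ["self esteem", "respect peers", "interest",
                  "openness", "leadership", "athletic"],
    "teacher_2": ["respect peers", "self esteem", "interest",
                  "respect authority", "independence"],
}
totals = school_totals(example_school)
```

Here self esteem and respect peers each total 9 (5 + 4), interest totals 6, and the sixth-ranked construct of the first teacher does not appear at all.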


Four of the five most highly ranked objectives related to social 
character development for the girls’ teachers; interest also received a high priority. 
While there was unanimity in those teachers’ views about the priorities which should 
be given to the socialising function of school, there were differences in opinion 
regarding the direction that character development should take. Two of the teachers 
emphasised concern about society and respect for peers, while the expressed priorities 
of the third were for openness and independence. 


The average ranking for the boys' teachers placed ambition at the top of the list; 
however, while one emphasised the supreme importance of ambition, the other 
inclined to the intra-personal factors of self-esteem and sensitivity development. 
Both attached some importance to the development of respect for peers, and one 
regarded the development of a proper respect for authority as more important than 
the latter. 


TABLE 2 
THE AVERAGE RANK ORDERS OF FIVE MOST HIGHLY RANKED CONSTRUCT CATEGORIES FOR INDIVIDUAL 
TEACHERS* AND THE SUMMED RANKS FOR INDIVIDUAL SCHOOLS 

Columns: Girls’ religious school (T1A, T2A, T3A, Total); Boys’ religious school (T1B, T2B, Total); 
Comprehensive school, class teachers and careers teacher (T1C, T2C, T3C, Total). 

Category                  Entries legible in the source (column alignment lost)

Personality
A1 introvn/extravn        2 (ex.), 2, 4 (int.), 4
A2 self esteem            3, 3, 5, 5, 2, 5, 4, 11

Social Character
B1 independence           4
B2 openness               5
B3 risk-taking
B4 leadership
B5 dependable
B6 respect authority
B7 respect peers

Competence/Ability
C1 intelligence
C2 likely success

Motivational
D1 interest               4
D2 ambition

Miscellaneous
E1 athletic
E2 background             2, 2
E3 interest in opp. sex

* highest rank = 5 


Unanimity of expressed objectives was marked at the comprehensive. Both the 
class teachers and the careers teacher saw self-esteem and respect for peers as the most 
important objectives of their work, closely followed by interest. One of the teachers 
also saw respect for authority as important. Further research into this question with 
a larger sample of teachers might indicate whether this reflects a gulf between the 
pragmatic comprehensive ethos and the attitudes of individual teachers involved in 
its implementation. Raven’s evidence suggests that in practice teachers in the 
Republic are more concerned with the immediate end of examinations than is sug- 
gested by the aims they articulate. An analysis of the individual grids examines this 
question in the context of these particular schools. 


The individual grids 

A number of the assumptions involved in the foregoing classification schema 
might well be questioned. One is the assumption that constructs provided by the 
teacher which, on common sense grounds, sound similar are in fact essentially the 
same. Grounds for caution in accepting this assumption have been discussed by 
Hargreaves (1977). A further assumption is that meanings are transpersonal, that is, 
shared by all teachers. These assumptions are inevitably involved in any classifi- 
cation schema, and Hargreaves points out that in the one Repertory Grid study that 
had the potential for exploring these assumptions Nash (1973) gave only one illus- 
tration of that interpretive work, providing no means of checking whether these 
assumptions constituted a distortion of the data. The following analysis represents 
an exploratory investigation of those assumptions. 


Makhlouf-Norris et al. (1970) developed an approach for the analysis of the 
structural organisation of constructs, which interprets their topographical organisation 
according to the pattern of significant correlations between them. Two main types 
of structure are discerned: articulated and non-articulated. The normal conceptual 
structure is articulated; it contains at least two different clusters which are joined 
together by linkage constructs. In contrast, in their application of this procedure to 
the thinking of obsessional neurotics, they found that the obsessional conceptual 
structure is generally non-articulated; it tends to be either monolithic, consisting of 
one dominant cluster with secondaries, or segmented, consisting of more than one 
cluster but with no linkage constructs. They found that 88 per cent of the normal 
control group had articulated structures, while only 36 per cent of the obsessionals 
had articulated structures. There is subsequent evidence that these differences may 
be general differences between ‘neurotics’ and ‘normals’, and not restricted to the 
obsessional neurotic (Fransella and Bannister, 1977). 
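
The three structure types can be expressed as a simple decision rule. The sketch below is an illustration under stated assumptions, not the procedure of Makhlouf-Norris et al.: the function name and input format are hypothetical, and a structure in which only some of the primary clusters are linked is counted as segmented, which follows the way the T1B grid is classified later in this paper.

```python
def classify_structure(n_primary_clusters, links):
    """Classify a grid structure as monolithic, segmented or articulated.

    n_primary_clusters: number of primary clusters (numbered from 0).
    links: pairs (i, j) of cluster indices joined by at least one linking construct.
    """
    if n_primary_clusters < 1:
        raise ValueError("at least one primary cluster is required")
    if n_primary_clusters == 1:
        return "monolithic"

    # Build an adjacency map between clusters from the linking constructs.
    neighbours = {index: set() for index in range(n_primary_clusters)}
    for i, j in links:
        neighbours[i].add(j)
        neighbours[j].add(i)

    # Walk outwards from cluster 0; the structure is articulated only if
    # every primary cluster can be reached through linking constructs.
    reached, pending = {0}, [0]
    while pending:
        current = pending.pop()
        for other in neighbours[current]:
            if other not in reached:
                reached.add(other)
                pending.append(other)

    return "articulated" if len(reached) == n_primary_clusters else "segmented"


two_linked = classify_structure(2, [(0, 1)])    # two clusters joined by a link
two_unlinked = classify_structure(2, [])        # two clusters, no link
partly_linked = classify_structure(3, [(0, 1)]) # third cluster left unconnected
single = classify_structure(1, [])              # one dominant cluster
```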


Makhlouf-Norris and Norris (1973) defined ‘primary’ clusters of constructs as 
groups of constructs which are all significantly correlated together but which are not 
significantly related to constructs in another cluster. The relationships of the re- 
maining constructs were classed as follows: 


(a) A secondary construct, or cluster of constructs is significantly related to 
some, but not all, of the constructs in a primary cluster. 

(b) Linking constructs are significantly related to one or more constructs in two 
or more independent primary clusters. 

(c) Tertiary constructs are not significantly related to constructs in a primary 
cluster, but to secondary linking constructs. 

(d) Isolated constructs are not significantly related to any other constructs. 
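
The four definitions above can be sketched as a two-pass classification once the primary clusters and the set of significant correlations are known. This is a hedged illustration: the function name, data format and construct names in the example are hypothetical, definition (c) is read as "related to a secondary or a linking construct", and a construct related only to other non-primary, non-secondary constructs is simply labelled "unclassified".

```python
def classify_constructs(constructs, significant_pairs, primary_clusters):
    """Assign each construct a role: primary, linking, secondary, tertiary or isolated.

    significant_pairs: pairs of constructs whose correlation meets the criterion.
    primary_clusters: list of sets of construct names.
    """
    related = {construct: set() for construct in constructs}
    for a, b in significant_pairs:
        related[a].add(b)
        related[b].add(a)

    in_primary = set().union(*primary_clusters)
    roles = {construct: "primary" for construct in in_primary}

    # First pass: roles defined directly by relations to primary clusters.
    for construct in constructs:
        if construct in in_primary:
            continue
        touched = [k for k, cluster in enumerate(primary_clusters)
                   if related[construct] & cluster]
        if len(touched) >= 2:
            roles[construct] = "linking"
        elif len(touched) == 1:
            roles[construct] = "secondary"
        elif not related[construct]:
            roles[construct] = "isolated"

    # Second pass: constructs related only to secondary or linking constructs.
    for construct in constructs:
        if construct in roles:
            continue
        if any(roles.get(other) in ("secondary", "linking")
               for other in related[construct]):
            roles[construct] = "tertiary"
        else:
            roles[construct] = "unclassified"
    return roles


# Hypothetical grid with two primary clusters.
names = ["Interested", "Hard-working", "Open", "Considerate",
         "Leadership", "Gentle", "Shy", "Athletic"]
clusters = [{"Interested", "Hard-working"}, {"Open", "Considerate"}]
pairs = [("Interested", "Hard-working"), ("Open", "Considerate"),
         ("Leadership", "Interested"), ("Leadership", "Open"),
         ("Gentle", "Hard-working"), ("Shy", "Gentle")]
roles = classify_constructs(names, pairs, clusters)
```

In this example Leadership is a linking construct, Gentle is secondary, Shy is tertiary and Athletic is isolated.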


The correlations between constructs are calculated in the INGRID programme 
(Slater, 1972) employed here on the basis of the angular distances between the 
constructs. The criterion of significance was set at 0.4, being roughly equivalent to 
the critical value of correlation coefficients for N = 30, α = 0.05. The number of 
elements rated by teachers ranged between 25 and 30. One exception to this was 
T3A who rated only 16 pupils; the level of significance adopted for this teacher was 
0.5, roughly equivalent to the critical value for N = 16, α = 0.05. 
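
The two criteria can be checked approximately. The sketch below uses the Fisher z transformation rather than the exact t-based critical value, and assumes a two-tailed test; neither choice is stated in the text, and the function name is hypothetical.

```python
from math import sqrt, tanh
from statistics import NormalDist


def approx_critical_r(n, alpha=0.05):
    """Approximate two-tailed critical |r| for n pairs (Fisher z approximation)."""
    if n <= 3:
        raise ValueError("n must exceed 3 for the Fisher z approximation")
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return tanh(z / sqrt(n - 3))


r_30 = approx_critical_r(30)  # about 0.36
r_16 = approx_critical_r(16)  # about 0.50
print(round(r_30, 2), round(r_16, 2))
```

Under these assumptions the values come out near 0.36 and 0.50, so the criteria of 0.4 and 0.5 adopted in the study are, respectively, slightly conservative and about right.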


The grid analysis produced very few genuinely articulated structures (25 per cent), 
and a relatively large proportion of segmented (38 per cent) or monolithic (38 per cent) 
structures. These findings are very similar to the results obtained by Makhlouf-Norris 
et al. (1970) with their group of obsessional neurotics, and very different from the 
predominantly articulated structures of their normal control group. Furthermore, 
that study used roughly the same number and type of elements (i.e., 20 people) and, 
although it elicited as many as 16 constructs, evidence suggests that the structure of 
the grid is not significantly altered by the emergence of more than seven constructs 
(Chetwynd-Tutton, 1974). 


The diagrams of construct relationships are presented in Figures 1 to 3, revealing 
individual differences in grid structure both between individuals and between schools. 
None of the structures demonstrated by the teachers from the girls’ school was 
articulated: the structures for T1A and T3A were monolithic, and the structure for 
T2A segmented. The structures for the boys’ teachers were also non-articulated. 
Whereas T1B showed linkage between two primary clusters, those were not integrated 
with the third and largest primary cluster, making this a segmented structure. T2B 
had a monolithic structure. At the comprehensive, one of the class teachers and the 
careers teacher had articulated structures, with two linked primary clusters; T1C had 
a segmented structure. 
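
The percentages quoted earlier follow directly from the eight individual structures just listed. A minimal tally, using the teacher labels from the text (the dictionary itself is only an illustrative device):

```python
from collections import Counter

# Structure type reported for each of the eight teachers.
structures = {
    "T1A": "monolithic",
    "T2A": "segmented",
    "T3A": "monolithic",
    "T1B": "segmented",
    "T2B": "monolithic",
    "T1C": "segmented",
    "T2C": "articulated",
    "careers teacher": "articulated",
}

counts = Counter(structures.values())
shares = {kind: 100 * n / len(structures) for kind, n in counts.items()}
```

Two of eight teachers give 25 per cent articulated, and three of eight give 37.5 per cent each for segmented and monolithic, which the text rounds to 38 per cent.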


In the monolithic structure, within the limits of the correlations between con- 
structs, any one construct implies the other. Constructs are selected to fit together in 
an almost invariable way, and constructs which set up opposing implications are 
avoided. Makhlouf-Norris et al. (1970) found, for example, that obsessionals avoid 
the use of constructs that set up opposing implications, narrowing the range of their 
judgments and reiterating judgments on the same theme. At the girls’ school, the 
monolithic structure of T1A consisted of a primary cluster concerned with strength 
of personality and character, related to secondary and tertiary groups of personality 
factors. The only task-related construct was Hard-working which appeared in a 
secondary cluster together with Dependable and Gentle, related to a lesser extent 
to Meek. There was a far greater use of task-related constructs by T3A—Interested, 
Hard-working, Ambitious—all of which were primary in this teacher’s thinking, 
though centrally linked to aspects of social character development (Open, Considerate). 
This is similar to the monolithic structure for the boys’ teacher, T2B. 


In segmented structures, independent judgments are made which have no impli- 
cations for each other. There is no means by which one part of the system can 
influence another; the system lacks any overall cohesive idea. Edmondson (1978) 
in a study of careers teachers’ perceptions of occupations found segmentation with 
respect to practical and blue collar occupations on the one hand, and occupations 
involving dealing with people, and white collar, professional occupations on the other. 
She found that those latter two groups or clusters are frequently combined or related, 
and opposed to the first ‘ practical’ cluster. In the case of people perception (cf. 
Makhlouf-Norris et al., 1970), a segmented structure involves and permits discrete 
cataloguing of the separate aspects of a person, but cannot bring these together into 
a single identity. This contrasts with the monolithic structure where independent 
judgments with opposing implications cannot be made and where there is a tendency 
to make judgments which mean the same thing. The segmented thinking of T2A at 
the girls’ school consisted of two unrelated primary clusters. Application and ability 
were seen as going together with social maturity; an extraverted, outgoing nature 
(Vivacious, Vocal in Class) was also seen as integral to those instrumental and social 
attributes. Quite independent of this cluster was another primary cluster reflecting 
the other main factor of personality to emerge in the study: Relaxed, Self-esteem. 
Its independence from the first cluster is in stark contrast to the importance attached 
to self-esteem in the research on educational achievement, but may be seen as 
consistent with the relatively low frequency of its occurrence in the religious school 
setting. Reserve with Adults, Tractable, and Uninterested were isolated constructs; 
again, it is striking that interest is in no way integrated by this teacher with any other 
construct(s), especially considering this teacher ranked it second in importance. It 
can be seen that all three girls’ teachers see intra- and inter-personal development as 
both central and either essentially intertwined with, or undifferentiated from, the 
development of task-appropriate skills. 


FIGURE 1 
INDIVIDUAL GRID STRUCTURES FOR GIRLS’ SCHOOL TEACHERS 

FIGURE 2 
INDIVIDUAL GRID STRUCTURES FOR BOYS’ SCHOOL TEACHERS 


The segmented thinking of T1B at the boys’ school contrasts with this picture. 
Here, motivation was seen as going together with social maturity, and the extraversion 
cluster (Sensitive, Shy) was independent of that first cluster. There were also no direct 
links between extraversion and the ability cluster; however, a link between the two 
provided by Defeatism caters for both congruent and incongruent implications between 
extraversion and ability, based on a positive link between lack of ability and defeatism 
and between introversion and defeatism. This teacher saw lack of ability as also tied 
to deceit. Vocal in Class was isolated in this teacher’s thinking, also in contrast to 
T2A who saw it as an expression of personality and ability. 


RACHAEL M. HENRY 359 


FIGURE 3 
INDIVIDUAL GRID STRUCTURES FOR COMPREHENSIVE SCHOOL TEACHERS 


The segmented thinking of the teacher at the comprehensive school again con- 
trasts with each of those examples of segmented thinking at the boys' and girls' 
schools. The first primary cluster for T1C reveals that secondary school pupils’ 
current level of maturity and future prospect of success were seen by this teacher as 
relating only to their academic background, with Athletic secondarily related. Quite 
independent of this first group of constructs, the second cluster comprised a serious 
attitude to work and to the teacher, related in turn to consideration for others. The 
personality variable most prominent in the thinking of the religious school teachers, 
namely extraversion, is only peripherally involved here in the guise of friendliness. 
The fact that no connections were made by this teacher between future success and 
the various task, and social character and personality factors elicited adds weight to 
the suggestion that the comprehensive teachers may be rejecting, in quite a radical 
way, the pragmatic ethos of comprehensivisation. Finally, T1C, in a manner similar 
to T2A, also saw self-confidence as independent of other constructs; it should be 
noted that this construct was ranked highly by both teachers. 


The only two fully articulated structures emerged from the remaining two 
comprehensive teachers. T2C exhibited two primary clusters linked together, 
integrating personality, motivational and social character factors. The first cluster 
was predominantly social (Co-operative with Teacher, Well-mannered, Mature); the 
appearance of Dedication in this cluster indicates its social character in this teacher's 
thinking. The conjunction of Outspoken and Enthusiastic marks the second 
primary cluster as one concerned with ‘participation’. This cluster superficially resembles 
Vocal in Class, which T2A saw as a direct expression of attributes of social 
character/personality/intellect and which T1B saw as a quite isolated construct, 
unrelated to anything else. In this teacher's thinking, however, 
enthusiasm and outspokenness are not directly linked to the social/instrumental 
attributes of the first cluster, but do have indirect links through the construct 
Leadership. The isolation of Easy-going and Shy from any other construct in this teacher's 
system may point to the lesser importance of extraversion/introversion in these 
comprehensive teachers’ thinking. The careers teacher was the only other teacher 
with an articulated system. His first factor conjoined intellectual and personal 
maturity (Inquiring Mind, Placid, Mature), related to a lesser extent to social develop- 
ment. The second cluster consisted of Authoritarian and Introverted, linked to the 
first cluster by Leadership, Realistic and Able; presumably this teacher perceives the 
authoritarian as evincing some of those abilities to deal with (manipulate?) the social 
and physical environment that contribute to a mature intellectual and personal 
development. 

These findings illustrate the varying and, with the exception of the comprehensive 
school, generally low levels of differentiation and articulation in people's construct 
systems even in areas central to them. They also reveal the inadequacy of verbal 
classification as a means of representing the frames of reference for individual teachers. 
This is highlighted in the summary below of the varying significance of some of the 
most common constructs and some of the principal classifications. 


Only two out of the six teachers who supplied a construct classed as self-esteem 
incorporated that construct in an articulated structure. The other four saw it either 
as isolated or undifferentiated from a host of other variables. This may reflect in 
part the inadequacy of the procedure of classing a general term like ‘maturity’ as a 
reference to self-esteem. Extraversion/introversion also had a very variable 
significance: T1A saw introversion as relating positively to application or ambition; T2A 
and T2C saw extraversion relating to hard work and/or ability; and while T2B saw 
extraversion/introversion as quite independent of ability, T1B, describing the same 
population, saw an indirect link between the two. 

The findings for the ability constructs cover an equally broad spectrum. T2A 
from the girls’ school saw intelligence as part of a broad group of personality/social/ 
motivational variables. One of the boys' teachers (T2B) saw it as isolated from any 
other variable, the other related its lack to deceit and, to a lesser extent, to defeatism. 
Regarding the motivational constructs, while T3A from the girls’ school saw them as 
primary, T1A not only saw Hard-working as secondary, but linked it with the female 
stereotyped attributes Dependable and Gentle. 


Likelihood of success was linked by the comprehensive teacher only with the 
very broad, undifferentiated factors of maturity and academic background, whereas 
the boys' teacher linked it to a range of social/motivational/personality factors. 


The ‘objective’ measures of the pupils 

The final approach to understanding the meaning of teacher construct systems 
in this study was to look at them in relation to their objects (that is, the ‘objective’ 
properties of the elements). In most educational systems it is expected that a teacher 
be aware of psychological differences between students, with a growing emphasis on 
pupils’ personality development. School records increasingly rely on psychological 
terms like ‘aggression’, ‘introversion’ and ‘anxiety’ and go on to play an important 
part in the screening of applicants for jobs, or at least in establishing the frames of 
reference through which subsequent teachers view pupils. Hence, the question of how 
well teachers can perform that evaluation has come under scrutiny. Type of school, 
age, social class, sex of pupil, age and sex of teacher have all been shown to affect 
teachers’ personality ratings of pupils in primary and secondary schools (e.g., Hallworth, 
1961; McIntyre et al., 1966). 

The foregoing analysis has pointed to the wide range of meaning and significance 
attached to common constructions. The departure from the traditional practice, both 
in this area and in personality and attitude research generally, of supplying subjects 
with items for assessment, has avoided the assumption that those supplied items are 
in fact significant to and representative of the outlooks of the individuals in question, 
and the analysis of the individual grids explores the further assumptions regarding 
the intrapersonal and transpersonal meanings of items. The problem which remains 
with repertory grid data is that it is limited to constructions which are conscious and 
which the individual chooses to articulate publicly. The examination of the repertory 
grid data in relation to the ‘ objective’ 16PF scores of pupils addresses this problem. 


Two groups of pupils were selected on the basis of their favourability rankings: 
a group of pupils unanimously approved (APP), and a group of pupils unanimously 
disapproved (DAP). These pupil-favourability scores were calculated for each teacher 
by weighting the ratings each pupil received on each of the constructs in accordance 
with the rankings applied by the teacher to those constructs in terms of educational 
desirability. Only the first seven ranks were used. The APP group comprised those 
pupils who fell in the top third of the distribution for all teachers concerned; the DAP 
group comprised all those pupils who fell in the bottom third of each teacher's 
distribution. Certain features of the distributions of 16PF scores for both groups at 
each school are selected for discussion in the context of the traits (i.e., constructs) 
which served best to discriminate for these teachers between good and bad pupils 
(see Table 2). The 16PF scores are classified on the following basis: 


Low: Sten score of 1-4 (expected frequency of 31 per cent) 


Mid: Sten score of 5 or 6 (expected frequency 38 per cent) 
High: Sten score of 7-10 (expected frequency 31 per cent) 
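
The expected frequencies quoted for the three bands can be checked with a short calculation. The sketch below assumes the conventional sten scaling (a normal distribution with mean 5.5 and standard deviation 2, so that each sten covers half a standard deviation); the text does not state this scaling, and the function names are illustrative only.

```python
from statistics import NormalDist

# ASSUMPTION: conventional "standard ten" scaling (mean 5.5, SD 2).
# The article gives only the band limits and expected percentages.
STEN = NormalDist(mu=5.5, sigma=2.0)


def sten_band(sten: int) -> str:
    """Classify a sten score into the Low/Mid/High bands used in the text."""
    if not 1 <= sten <= 10:
        raise ValueError("sten scores run from 1 to 10")
    if sten <= 4:
        return "Low"
    if sten <= 6:
        return "Mid"
    return "High"


def expected_band_percentages() -> dict[str, int]:
    """Expected percentage of a normative population falling in each band."""
    low = STEN.cdf(4.5)          # stens 1-4 lie below the 4.5 boundary
    high = 1.0 - STEN.cdf(6.5)   # stens 7-10 lie above the 6.5 boundary
    mid = 1.0 - low - high       # stens 5 and 6
    return {
        "Low": round(100 * low),
        "Mid": round(100 * mid),
        "High": round(100 * high),
    }


if __name__ == "__main__":
    print(expected_band_percentages())
```

Under that assumption the rounded percentages come out at 31, 38 and 31, in agreement with the expected frequencies listed above.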


The observed numbers of scores falling into each of these classes are produced in 
Table 3. 
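
The selection of the APP and DAP groups described above might be sketched as follows. This is an illustration only: the article does not give the actual weights, so the sketch assumes that a construct ranked r (1 = most desirable) receives a weight of 8 - r, that higher ratings are more favourable, and that the "thirds" are cut by position in each teacher's ordered list. All function and variable names are hypothetical.

```python
MAX_RANK = 7  # the text states that only the first seven ranks were used


def favourability(ratings: dict[str, float], ranks: dict[str, int]) -> float:
    """Weighted sum of one pupil's ratings from one teacher.

    ASSUMPTION: weight = MAX_RANK + 1 - rank for ranks 1..7; constructs
    ranked lower than seventh (or unranked) contribute nothing.
    """
    total = 0.0
    for construct, rating in ratings.items():
        rank = ranks.get(construct)
        if rank is not None and rank <= MAX_RANK:
            total += (MAX_RANK + 1 - rank) * rating
    return total


def top_and_bottom_thirds(scores: dict[str, float]) -> tuple[set[str], set[str]]:
    """Pupils in the top and bottom thirds of one teacher's distribution."""
    ordered = sorted(scores, key=lambda pupil: scores[pupil], reverse=True)
    k = len(ordered) // 3
    top = set(ordered[:k])
    bottom = set(ordered[len(ordered) - k:])
    return top, bottom


def select_groups(scores_by_teacher: list[dict[str, float]]) -> tuple[set[str], set[str]]:
    """APP = in every teacher's top third; DAP = in every teacher's bottom third."""
    thirds = [top_and_bottom_thirds(scores) for scores in scores_by_teacher]
    app = set.intersection(*(top for top, _ in thirds))
    dap = set.intersection(*(bottom for _, bottom in thirds))
    return app, dap
```

Requiring unanimity across teachers is what keeps the two groups small: a pupil rated highly by one teacher but only moderately by another falls into neither group.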


TABLE 3 

NUMBER OF LOW (L = 1-4), MEDIUM (M = 5-6) AND HIGH (H = 7-10) STEN SCORES ON THE 16PF SCALE FOR THE APPROVED AND DISAPPROVED GROUPS 

[Table body illegible in the source scan.] 


The girls’ school. These teachers were unanimous in emphasising the socialising 
function of school: two emphasised concern about society and respect for peers; 
the third stressed openness and independence. It will be recalled that the widest range 
of qualities of social character emerged from the teachers at this school. 


The 16PF findings do not reflect those traits, pointing to the narrow meaning 
attached to those variables of social character. The APP pupils were mostly not 
dominant (E), suspicious (L) or radical (Q1); by contrast, none of the DAP pupils 
was submissive (E), trusting (L) or group dependent (Q2), belying the significance 
accorded to independence. Furthermore, neither group was distinguished on venture- 
some or superego strength (G). All three of the DAP pupils were of low intelli- 
gence (B), whereas the APP pupils were distributed across that dimension. 


Hence the girls’ teachers’ nominations of the widest range of, and highest priority 
to, social constructs, were not mirrored by the profiles of 16PF scores. These findings, 
considered together with the monolithic character of the teachers’ grids, may signify 
limitations to interpreting data on construct frequencies and rankings in isolation from 
the individual grids (cf. Nash, 1973). 


The boys’ school. In contrast to the girls’ teachers, the APP boys were selected 
for ambition, the intrapersonal factor of self-esteem and the interpersonal factors of 
respect for peers and for the teacher. 


The importance attached to sensitivity was consistent with the 16PF profiles in 
so far as the APP boys were mostly tender-minded (I) and imaginative (M), while the 
DAP boys were low to middling on these factors. The overall evidence is also 
consistent with the teachers' positive evaluations of self-esteem: while the APP boys 
were mostly self-assured (O), high in ego strength (C) and trusting (L), the DAP group 
did not share this pattern for self-esteem. However, both groups tended to low 
integration (Q3) and high ergic tension (Q4). Regarding ambition, which was 
perceived to be the single most important factor differentiating APP from DAP, the 
APP boys were only average on venturesome (H), and low on assertive dominance 
(E), while the DAP boys were middling to high on these two factors. It might be 
argued that the appearance of ambition as one of a host of primary cluster variables 
in both teachers’ grids pointed to a lack of differentiation with respect to this construct, 
and the findings here are consistent with the grid implications that ambition is linked 
closely by these teachers to conformity. Finally, while ability was prominent in the 
boys’ teachers’ characterisations of pupils, it did not feature strongly in their evalu- 
ations, and it is striking that both APP and DAP were low to middling on intelligence 
(B), in contrast to the two groups of girls who were differentiated on B. 


The comprehensive school. At the comprehensive, self-esteem and respect for 
peers were perceived unanimously as the most important objectives, closely followed 
by interest. An unexpected finding was that all the pupils comprising the APP group 
were female, and all the pupils in the DAP group were male. 


Regarding self-esteem, the APP group in fact were mostly low in ego strength 
(C) and high in ergic tension (Q4); the DAP group was mostly high in ego strength, 
and low to middle on ergic tension. Both groups were moderately high on guilt 
proneness (O). While the APP group was mostly venturesome (H) and had high 
levels of integration (Q3), and the DAP group mostly low on integration, the overall 
picture is not consistent with the salience of self-esteem for these teachers. More 
seriously, it is at odds with the individual grid findings which described the compre- 
hensive teachers’ constructions as more articulated, particularly with regard to this 
factor. In contrast to the religious schools, at the comprehensive, where independence 
was not assigned a high priority, and where ambition was conspicuous by its absence, 
the APP group was mostly radical (Q1) while the DAP were all conservative. Both 
groups did, however, tend to be group dependent (Q2), and were middling to high on 
the superego factor (G). Finally, while ability was not prominent in teachers' 
constructs, the APP group were evenly distributed across the dimension of intelligence, 
while the DAP group were mostly of low intelligence. 


CONCLUSIONS 


The present study represents a small beginning in the attempt to evaluate the 
usefulness of Repertory Grid Technique in the investigation of teachers’ attitudes. 
Both the dimensions in terms of which teachers discriminate between pupils and their 
articulated objectives found the teachers in the comprehensive school more attuned 
to the expressive, non-academic qualities of their pupils than to their instrumental, 
task-related attributes. Furthermore, while it is true that the comprehensive teachers’ 
construct systems were generally differentiated and articulated structures, they were 
nevertheless relatively undistinguished with respect to task-related constructs. Indeed, 
the wide range and generally incoherent nature of the connections perceived to hold 
between traits of social character and instrumental attributes may indicate the need 
to qualify the suggestion that has emerged from studies such as Craig's (1960), that 
‘training for citizenship’ is seen by secondary teachers to be inherent in the intellectual 
activities of their chosen discipline, hence their task-oriented approach. 


Clearly the generality of such findings as modifications in teachers’ perceptions 
in accordance with pupil's sex in the religious schools, and concern about the preser- 
vation of traditional sources of authority in all schools, will only be determined by a 
wider investigation. 


The evidence both from the grids and the 16PF confirmed the common finding 
that pupil personality is invariably tied in some way to pupil evaluation. It did not 
confirm the salience of some clinical concept of ‘withdrawal’, singled out for study 
by Beilin: while the constructs defining extraversion/introversion did feature in 
teachers' observations, they did not emerge in any consistent relation with the task-related 
constructs, and were less evident in the comprehensive teachers’ perceptions 
than constructs relating to self-esteem. The resurrection of the ‘self’ as a central 
construct in personality theory, going along with the challenges to the medical model 
which have come from humanistic psychology, may account for its significance for 
these comprehensive teachers, as compared with the clinical perspectives of Beilin's 
teachers in the 1950s. Furthermore, the pattern of findings for the factors underlying 
the 16PF second stratum factor of adjustment versus anxiety was not consistent 
either between teachers or even between the repertory grid and 16PF findings. 
What was most striking about the results was the wide range of meaning and significance 
subsumed under clinical or psychological constructions such as ‘extraversion/introversion’ 
and ‘self-esteem’. The disjunctions between the grid data and the 
16PF data are hardly surprising in view of the heterogeneous character of such 
constructions. 


The findings did, then, bear out the need to go beyond conscious constructions 
and articulated objectives to examine their underlying individual meanings. They 
also supported the suggestion that the inconsistencies found by Raven (1977) between 
teachers' professed objectives on the one hand, and their actual classroom practice on 
the other, may be attributed to the methodology employed: the fact that the range of 
constructs elicited from teachers in the present study was so much narrower than the 
range of issues which Raven's questionnaire/interview studies suggest are salient, 
nominates that methodology as a likely principal contributor to those anomalies. 
One of the most striking findings of the study was the generally low level of articulation 
and differentiation in teachers’ construct systems in an area of central significance, 
being very much inferior to the perceptions of significant others by a ‘normal’ 
sample outside the teacher/pupil context (cf. Makhlouf-Norris et al., 1970). Teachers 
most commonly perceived pupils from either a monolithic or segmented perspective, 
that is, either in accordance with a narrow range of judgments, all reiterating the same 
underlying theme, or in terms of a number of disparate, unrelated categories lacking 
any overall cohesive idea. Factors which have achieved prominence in educational/ 
psychological thinking in the last decade were rarely incorporated in an articulated 
structure. 


Several constraints on the interpretation of repertory grid data were indicated by 
the findings from the grids and 16PF. Interpretations of the construct frequency and 
rank data were seriously modified by the analysis of the individual grids, bearing out 
Hargreaves' reservations regarding the assumptions of intrapersonal and transpersonal 
meaning left unexplored in Nash's (1973) work. Furthermore, the disjunction between 
the repertory grid data and 16PF results may point to the limits to even the more 
sophisticated analyses of repertory grid data considered in isolation from external, 
‘objective’ sources of information about the elements. In defence of the repertory 
grid as a tool in its own right, several explanations might be advanced as diminishing 
the significance of the 16PF results. One might be that teachers’ labellings do not 
conform to the operational definitions of personality constructs in psychology, and 
that this hardly challenges the value of Repertory Grid Technique since the elucidation 
of such phenomena is precisely its object. A second challenge to the author's conclusion 
might be that those 16PF factors which emerge as significant discriminators 
between the two groups of pupils but which were not reflected in the teachers’ grids 
may not, indeed, be part of some unconscious or unarticulated selection mechanism 
but rather factors which simply happen to correlate with one or more of the dimensions 
which do have a place in the teachers’ construct systems. However, the present 
findings cannot be dismissed either as simply a ‘labelling’ phenomenon (except in 
the sense of misrepresentation) or as an instance of confounding variables when 
considered in the light of some of the total non-correspondences of meaning, as 
demonstrated in the differences between the teachers' constructions and the 16PF 
factors. 


EFFECTS OF ACADEMIC DEPARTMENTS ON STUDENTS’ 
APPROACHES TO STUDYING 


By P. RAMSDEN 
(Institute for Post-Compulsory Education, University of Lancaster) 
AND N. J. ENTWISTLE 
(Department of Education, University of Edinburgh) 


SUMMARY. 2208 students from 66 academic departments in six contrasting disciplines 
from British universities and polytechnics completed an ‘approaches to studying’ 
inventory and a course perceptions questionnaire. Factor analyses of these instruments 
confirmed the factor structures previously reported. Approaches to studying can be 
described in terms of three main factors—orientations towards personal meaning, 
reproducing, and achieving. In the present analysis the final factor split into two: 
achieving orientation and a factor labelled ‘disorganised and dilatory’ which showed 
a close relationship with self-rating of academic progress. The course perceptions 
questionnaire produced two main factors. One described formal teaching methods, 
vocational relevance, and clear goals and standards, and the other represented a favour- 
able departmental evaluation with the highest loadings on good teaching and openness 
to students. Subsequent analyses examined links between students’ perceptions of 
their main academic departments and their reported approaches to studying. Depart- 
ments with highest mean scores on meaning orientation were perceived as having good 
teaching and allowing freedom in learning. Departments with the highest mean scores 
on reproducing orientation were seen to have a heavy workload and a lack of freedom 
in learning. The implications of these statistical findings are discussed in relation to 
continuing analyses of interview data which clarify the ways in which the organisation 
of teaching and courses may affect students’ approaches to learning. 


INTRODUCTION 


A SYMPOSIUM of articles on ‘learning processes and strategies’ ran in the British 
Journal of Educational Psychology between February 1976 and November 1978. These 
articles described a series of constructs related to the learning processes of students 
which attempted to explain characteristically different approaches to, and styles of, 
studying. 

Marton and Säljö (1976a) distinguished between deep and surface approaches to 
reading an academic article. Essentially the deep approach involved an active attempt 
by the student to understand the author’s meaning, to explain the evidence in relation 
to the conclusion, and to relate the ideas contained in the article to the student’s 
previous knowledge and experience. The surface approach, in contrast, was charac- 
terised by a tendency to memorise discrete facts or ideas, to be anxiously aware of the 
need to reproduce information at a later time, and to view a particular task in isolation 
both from the academic subject as a whole and from real life. 


Svensson (1977) argued that a deep approach to studying was functionally related 
to both conscientious and effective study methods and to examination performance, 
while Marton and Säljö (1976b) warned that questions which encourage the regurgitation 
of factual answers are likely to shift a student towards a surface approach. 
Fransson (1977) was able to demonstrate that the approach to learning depended on 
perceived relevance and anxiety: interest in the subject matter of the article en- 
couraged a deep approach, while a stressful learning situation produced more surface 
learning. 

In Marton's research the instructions to the students were ambiguous in terms of 
the type of learning required: the students had to decide for themselves whether 
understanding was necessary. Pask (1976a, 1976b), however, demanded evidence of 
understanding in most of his learning experiments. Yet here again students used 
different strategies. Some students (holists) relied heavily on analogies, illustrations 
and anecdotes in building up a general understanding of a topic. Other students 
(serialists) concentrated on step-by-step learning and on the detailed arguments and 
evidence presented. In normal academic environments extreme styles of learning are 
apparently associated with characteristic pathologies. Operation learners (who 
consistently adopt serialist strategies) fail to build up a general picture of what is to be 
learned and ignore important analogies or inter-relationships between ideas (improvi- 
dence), while comprehension learners (consistent holists) have a tendency to jump to 
unsubstantiated conclusions and to make implausible links between ideas (globe- 
trotting). 

A subsequent symposium in Higher Education (July, 1979) highlighted the important 
effect of the context of learning on both the approach (Ramsden, 1979) and 
the strategy (Laurillard, 1979) adopted. Students were not consistent. Their approach 
varied to some extent from department to department and from task to task, and 
students also varied their strategies across different types of task. On the other 
hand, two studies (Entwistle et al., 1979; Biggs, 1978, 1979) showed that inventories 
could be used to distinguish between characteristic orientations to studying 
in ways which implied a certain consistency of approach. It is necessary, therefore, 
to accommodate both consistency and variability in any scheme which seeks to 
describe the ways in which students approach learning tasks (Entwistle, 1979). 


From the work of Ramsden (1979) and Laurillard (1979), it is now clear that 
variability in approach or style is partly a function of differences between individual 
academic tasks. But there was also evidence in Ramsden's study that students respond 
to the context of learning defined by the teaching and assessment methods of academic 
departments. Some departments and some lecturers seemed to facilitate a deep 
approach, while others used methods of teaching, or made course work demands, 
which forced students into surface approaches. In interviews, students clearly 
perceived lecturers as affecting their approaches to studying. These observations 
were in contrast to previous unsuccessful attempts (see, for example, Dubin and 
Taveggia, 1969) to find relationships between different methods of teaching and 
student learning. It thus seemed important to seek additional evidence about which 
particular aspects of departmental organisation appear to influence the ways students 
study. 

It should be possible, at least in theory, to describe departmental context in terms 
of the educational objectives espoused by staff, and the teaching methods, syllabuses, 
and past examination papers. However, it proves difficult to make equivalent com- 
parisons between departments using such criteria. An alternative approach is to 
capitalise on the fact that it is not so much how staff say they operate that is important, 
but how the students perceive the courses and the teaching. In an earlier part of this 
study, interviews with students led to the development of a questionnaire of course 
perceptions (Ramsden, 1979). Factor analysis of the items, supported by conceptual 
analyses of the interview data, suggested that students described departments in terms 
of eight partially overlapping dimensions. One group of sub-scales distinguished 
between faculties: these were formal teaching methods, clear goals and standards, and 
vocational relevance. These characteristics were most commonly found in science 
and technology departments. The second group of sub-scales distinguished between 
the most and the least favourably evaluated departments: good teaching, freedom in 
learning, openness to students, heavy workload and, to a lesser extent, good social 
climate. Thus some important aspects of departmental learning context can be 
described in terms of the sub-scales of this questionnaire. 


An SSRC research programme at Lancaster has been trying to develop the ideas 
of Marton and Pask in relation to earlier work in Lancaster and Aberdeen on students’ 
motivation and study methods (Entwistle and Wilson, 1977). An inventory of approaches 
to studying was developed (Entwistle et al., 1979) which, from an initial 15 
sub-scales, produced three main study orientations: (personal) meaning orientation 
(deep approach + comprehension learning); reproducing orientation (surface approach 
+ operation learning); and achieving orientation (organised study methods + 
achievement motivation). These three second-order factors were closely similar to 
those obtained independently by Biggs (1978, 1979) with Australian students. 


The present study brings together results from revised versions of the ‘approaches 
to studying’ inventory and the course perceptions questionnaire to explore the extent 
to which approaches to studying can be explained in terms of students' perceptions of 
their courses. 


METHOD 

Method of measurement 

The questionnaire administered to students consisted of three parts. The first 
section asked for background information about school examination results and 
honours specialism(s), and also contained a self-rating question in which students were 
asked to assess their own academic progress to date (How well do you think you are 
doing so far on this subject/course, compared with other students?). A similar 
approach to self-assessment of mathematical aptitude proved successful in an earlier 
study (Entwistle and Wilson, 1977), with a correlation between self-rating and objective 
test score of +0.65. 


The second section of the questionnaire contained a shortened and refined version 
of the ‘approaches to studying’ inventory with the 16 sub-scales shown in Figure 1. 
Evidence of satisfactory reliability for the earlier scales has already been reported 
(Entwistle et al., 1979). 


The third section contained the eight sub-scales of students’ perceptions of their 
honours department courses; these are also shown in Figure 1. 


Sample 

A letter describing the purpose of the investigation was sent to 171 departments 
in 54 universities and polytechnics in England, Wales, Scotland and Northern Ireland. 
Ninety-five departments agreed in principle to co-operate, and an adequate proportion 
of completed questionnaires for analysis was eventually obtained from 66 of them. 

The target population was second-year undergraduates (third-year in Scotland) 
taking honours degrees in departments of English, History, Economics, Psychology, 
Physics or Engineering. The six disciplines were chosen to provide a range of specialisms; 
five of them had been used previously in the interview study (Ramsden, 1979). 

The response rate from students was estimated to be 73 per cent. (Returns from 
departments showed the class size, but it was not always possible to be sure exactly how 
many of the class had received the questionnaire.) Students were asked to give their 
names (to allow degree results to be obtained subsequently), but they returned the 
questionnaires to the investigators in sealed envelopes, with a guarantee that depart- 
mental staff would not see their responses. 


Table 1 shows a breakdown of the sample by discipline. 


RESULTS 
The analyses were designed to investigate the following questions: 


(1) What differences in approaches to studying and course perceptions exist 
between departments of the same discipline and between disciplines ? 

(2) Do the second-order factor structures of both sets of scales reappear in this 
larger, national sample? 


P. RAMSDEN and N. J. ENTWISTLE 


FIGURE 1 

SUB-SCALES CONTAINED IN THE QUESTIONNAIRE 

Sub-scale                         Meaning 

APPROACHES TO STUDYING 

Meaning Orientation 
Deep Approach                     Active questioning in learning 
Inter-relating Ideas              Relating to other parts of the course 
Use of Evidence                   Relating evidence to conclusions 
Intrinsic Motivation              Interest in learning for learning's sake 

Reproducing Orientation 
Surface Approach                  Preoccupation with memorisation 
Syllabus-boundness                Relying on staff to define learning tasks 
Fear of Failure                   Pessimism and anxiety about academic outcomes 
Extrinsic Motivation              Interest in courses for the qualifications they offer 

Achieving Orientation 
Strategic Approach                Awareness of implications of academic demands made by staff 
Disorganised Study Methods        Unable to work regularly and effectively 
Negative Attitudes to Studying    Lack of interest and application 
Achievement Motivation            Competitive and confident 

Styles and Pathologies 
Comprehension Learning            Readiness to map out subject area and think divergently 
Globetrotting                     Over-ready to jump to conclusions 
Operation Learning                Emphasis on facts and logical analysis 
Improvidence                      Over-cautious reliance on details 

PERCEPTIONS OF COURSES 

Formal Teaching Methods           Lectures and classes more important than individual study 
Clear Goals and Standards         Assessment standards and ends of studying clearly defined 
Workload                          Heavy pressures to fulfil task requirements 
Vocational Relevance              Perceived relevance of course to careers 
Good Teaching                     Well-prepared, helpful, committed teachers 
Freedom in Learning               Discretion of students to choose and organise own work 
Openness to Students              Friendly staff attitudes and preparedness to adapt to students’ needs 
Good Social Climate               Quality of academic and social relationships between students 


TABLE 1 

BREAKDOWN OF SAMPLE BY DISCIPLINE AND SUBJECT AREA 

                      Number of      Number of 
Discipline            departments    students 

English                    9            282 
History                    7            209 
  Arts                    16            491 

Economics                 12            450 
Psychology                14            402 
  Social Science          26            852 

Physics                   11            357 
Engineering               13            508 
  Science                 24            865 




(3) Does the factor structure for the combined sets of scales suggest links between 
approaches to studying and course perceptions ? 

(4) Are these factors general, appearing consistently in different subject areas ? 

(5) Are the ‘approaches to studying’ scales effective in predicting self-rated 
academic progress ? 

(6) Are differences in students' orientations to studying associated with any 
particular course perceptions sub-scales? 


Sub-scale means 

Table 2 shows the mean of the 66 departmental mean scores on each of the sub- 
scales. It is clear that, although there are marked differences between the disciplines, 
there remain wide variations between departments within the same discipline. These 
variations are shown by ranges, rather than standard deviations, in view of the small 
number of departments in each discipline. 
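
As an illustration of the summary just described, the sketch below computes, for one sub-scale, the mean of the departmental means and the range within each discipline. The figures in the usage note are invented purely to show the calculation and are not taken from Table 2; the function name is hypothetical.

```python
from statistics import mean


def summarise_discipline(dept_means: dict[str, list[float]]) -> dict[str, tuple[float, float, float]]:
    """For each discipline return (mean of departmental means, lowest, highest).

    With only a handful of departments per discipline the lowest and highest
    departmental means describe the spread more directly than a standard
    deviation would.
    """
    return {
        discipline: (mean(values), min(values), max(values))
        for discipline, values in dept_means.items()
    }
```

For example, `summarise_discipline({"History": [10.0, 12.0, 17.0]})` gives a mean of 13.0 with a range from 10.0 to 17.0 for those made-up departmental means.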


Factor structure of ‘approaches to studying’ inventory 

The SPSS program (Nie et al., 1975) was used to carry out principal factor 
analyses initially. An iterative procedure was used to provide communality estimates, 
while the number of factors extracted was determined by eigenvalues (> 1.0). Oblique 
rotation to Thurstone's criteria for simple structure provided the final factor structure 
loadings reported. Loadings ≥ 0.25 were taken as salient. 
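
The two decision rules in this paragraph (retain factors with eigenvalues above 1.0; treat loadings of 0.25 as the salience threshold) can be sketched in a few lines. This is not the SPSS procedure itself: the extraction and oblique rotation are not reproduced, the function names are hypothetical, and applying the threshold to the absolute value of a loading is an assumption.

```python
def retained_factors(eigenvalues: list[float], cutoff: float = 1.0) -> list[float]:
    """Eigenvalues of the factors kept under the eigenvalue-greater-than-one rule."""
    return [ev for ev in sorted(eigenvalues, reverse=True) if ev > cutoff]


def variance_explained(eigenvalues: list[float], n_variables: int, cutoff: float = 1.0) -> float:
    """Proportion of total variance accounted for by the retained factors.

    For standardised variables the total variance equals the number of
    variables in the correlation matrix.
    """
    return sum(retained_factors(eigenvalues, cutoff)) / n_variables


def salient_loadings(loadings: dict[str, float], threshold: float = 0.25) -> dict[str, float]:
    """Loadings at or beyond the threshold.

    ASSUMPTION: the threshold applies to the absolute value, so that
    strong negative loadings also count as salient.
    """
    return {name: value for name, value in loadings.items() if abs(value) >= threshold}
```

With hypothetical eigenvalues of 3.0, 2.0, 1.5, 1.1, 0.9 and 0.5 for ten variables, four factors would be retained and would account for 76 per cent of the variance.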


Table 3 presents the factor structure formed from an analysis of the approaches 
to studying sub-scale totals for all 2208 students (weighting factors were used to 
compensate for the different numbers of students in each of the six disciplines). 
School examination performance (sum of the best three A-level or best five Scottish 
Highers grades) and self-rating of academic record in higher education were added to 
the correlational matrix. Four factors had eigenvalues greater than one and they 
accounted for 51 per cent of the variance. 
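
The eigenvalue criterion and the variance figures can be checked with simple arithmetic. The sketch below uses the four eigenvalues reported in Table 3 and assumes that 18 variables entered the analysis (the 16 approaches to studying sub-scales plus the two academic performance measures). That count is inferred from the table, not stated in the text.

```python
# Eigenvalues reported in Table 3 for Factors I to IV.
EIGENVALUES = [3.74, 2.55, 1.86, 1.07]

# Assumption: 16 sub-scales + 2 academic performance measures = 18 variables.
N_VARIABLES = 18


def retained(eigenvalues, threshold=1.0):
    """Keep only the factors whose eigenvalue exceeds the threshold."""
    return [value for value in eigenvalues if value > threshold]


def percent_variance(eigenvalues, n_variables):
    """Percentage of total variance attributable to each factor."""
    return [100.0 * value / n_variables for value in eigenvalues]


kept = retained(EIGENVALUES)
shares = percent_variance(kept, N_VARIABLES)
total = sum(shares)

for number, share in enumerate(shares, start=1):
    print(f"Factor {number}: {share:.1f}% of variance")
print(f"Total: {total:.1f}%")
```

Under that assumption the shares round to 21, 14, 10 and 6 per cent and sum to about 51 per cent, in line with the figures quoted.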


The first two factors are almost identical to those previously described as meaning 
orientation and reproducing orientation. The previous third factor of achieving 
orientation is divided into two. Factor III has its highest loading on disorganised 
study methods and negative attitudes to studying, a factor which had emerged from 
an earlier inventory of motivation and study methods (Entwistle, 1975). This factor 
represents disorganised and dilatory approaches to studying. Factor IV is closer to 
the previous ‘achieving orientation’ with high loadings on strategic approach and 
both extrinsic and achievement motivations. There is also an apparent readiness to 
adopt either deep or surface approaches, which is consistent with a previous finding 
(Entwistle et al., 1979) that students with an achieving orientation will seek high grades, 
using meaningful or rote learning, whichever seems to produce the best results. 


Factor III (Disorganised and Dilatory) shows the highest (negative) loading on 
self-rating of academic progress. As expected, meaning orientation is positively 
related to achievement, while the reproducing orientation shows a negative relation- 
ship. Surprisingly, the achieving orientation itself shows only a slight association 
with the self-rating of academic progress. However, all these relationships will have 
to be checked subsequently, once a more satisfactory criterion of achievement (degree 
class) is available. 


Given the rather weak average relationships between A-level grades and univer- 
sity performance, it is not surprising to find only small factor loadings on school 
examination results. The raw correlations between admissions grades and academic 
progress show remarkable similarity to those found between A-level grades and 
degree results in several previous studies. The correlation is highest (r = 0.24) in 
science and lowest (0.10) in social sciences. This pattern of relationship helps to 
substantiate the use of self-ratings of progress at this stage of the investigation. 
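
To give a sense of how weak these relationships are, a correlation can be converted into the percentage of variance the two measures share (the square of r). The short sketch below applies this to the two correlations quoted above (0.24 and 0.10); the function name is illustrative only.

```python
def shared_variance_percent(r):
    """Percentage of variance two measures share, given their correlation r."""
    return 100.0 * r * r


for area, r in (("science", 0.24), ("social sciences", 0.10)):
    share = shared_variance_percent(r)
    print(f"{area}: r = {r:.2f}, shared variance = {share:.1f}%")
```

On this reckoning admissions grades share roughly 6 per cent of their variance with self-rated progress in science and about 1 per cent in social sciences.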


TABLE 2
MEANS OF SUB-SCALES AND RANGES OF DEPARTMENTAL MEAN SCORES BY DISCIPLINE

Sub-scale    English      History      Psychology   Economics    Physics      Engineering
             Mean Range   Mean Range   Mean Range   Mean Range   Mean Range   Mean Range
TABLE 3 

FACTOR ANALYSIS OF APPROACHES TO STUDYING SCALES FOR TOTAL SAMPLE 
(N = 2208) 

Factors (51% variance explained) 

Variables I II III IV 
Academic Performance 
School : (—02) (—13) (—15) (—07) 
Higher Education 31 —26 —39 (19) 
Approaches to Studying 
Deep Approach 70 (22) 
Inter-relating Ideas 65 
Use of Evidence 54 (23) 
Intrinsic Motivation 72 —25 
Surface Approach 57 36 30 
Syllabus-boundness —41 58 (24) 
Fear of Failure 50 34 
Extrinsic Motivation —25 38 53 
Strategic Approach 29 48 
Disorganised Study Methods —25 50 
Negative Attitudes to Studying —39 52 
Achievement Motivation (24) 45 
Comprehension Learning 55 (—24) 30 
Globetrotting 52 
Operation Learning 62 44 
Improvidence 68 (24) 26 
Eigenvalues 3.74 2.55 1.86 1.07 
Percentage Extracted Variance 21 14 10 6 


FACTOR PATTERN CORRELATIONS 

       1     2     3 
F1 
F2   −17 
F3   −14    27 
F4    16    35   −13 

Decimal points and loadings less than 0.25 omitted. 
Factor I    Meaning Orientation. 
Factor II   Reproducing Orientation. 
Factor III  Disorganised and Dilatory. 
Factor IV   Achieving Orientation. 


Factor structure of course perceptions questionnaire 

Table 4 indicates the groupings of the course perceptions scales. Factor I con- 
tains the sub-scales of vocational relevance, formal methods, and clear goals and 
standards, while the second factor describes favourable evaluation, with the highest 
loadings on good teaching, openness to students and freedom in learning. Good 
social climate shows moderate loadings on both factors, while heavy workload appears 
in neither of them. These groupings are similar to those obtained in previous analyses, 
except that the first factor previously also contained negative loadings on formal 
teaching and heavy workload. The large sample in the current analysis, however, 
makes this the definitive grouping of scales. 
TABLE 4 
FACTOR ANALYSIS OF COURSE PERCEPTIONS SCALES FOR TOTAL SAMPLE 

Factors (56% variance explained) 

Variables I II 
Formal Teaching Methods 71 
Clear Goals and Standards 57 30 
Workload (—24) 
Vocational Relevance 72 
Good Teaching 76 
Freedom in Learning 57 
Openness to Students 76 
Good Social Climate 32 42 
Eigenvalues 1.90 2.53 
Percentage Extracted Variance 24 32 


Correlation between factors is 0.05. Decimal points and loadings less than 0.25 omitted. 

Factor I   Formal Vocational Teaching. 
Factor II  Positive Evaluation of Teaching and Courses. 


Approaches to studying and course perceptions 

When the two sets of sub-scales are brought together, the factor analyses tend to 
retain the separate identities of the two parts of the questionnaire. Table 5 shows that 
there are four factors relating to approaches to studying, three of which are recognis- 
able as the main orientations. Factors V and VI are the two course perceptions 
groupings. Although there is not a great deal of overlap between approaches and 
perceptions, what does occur makes good sense. The reproducing orientation is 
clearly associated with a heavy workload, the achieving (strategic) orientation goes 
with perceived clear goals and standards, while the positive evaluation factor (good 
teaching and freedom in learning) shows positive loadings on intrinsic motivation and 
use of evidence. 


Factor V links vocational relevance with extrinsic motivation. As this factor 
might be thought to be largely a description of subject area differences, the six factors 
are also shown separately by area of study. 


Area of study differences in factor structure 

The factor structures for the approaches to studying scales and the course per- 
ceptions scales are not shown separately by area of study since there is no difference 
in the patterns of loadings. Table 6 shows that the patterns are also similar for the 
combined analyses. Even Factor V (Formal teaching and vocational relevance) is 
recognisable in each faculty. The only marked exception is the more weakly defined, 
and largely uninterpretable, Factor IV. 


Meaning orientation (Factor I) retains its emphasis on syllabus-freedom and its 
stylistic component of comprehension learning across subject areas, although this 
component is much weaker among arts students. This general approach to studying 
is related to good teaching, freedom in learning, clear goals and standards, and less 
reliance on formal methods of instruction (implying, perhaps, greater use of discus- 
sion methods). There is, however, a suggestion here of an area of study difference. 
Freedom and non-formal methods are related to meaning orientation more strongly 
in the sciences and social sciences, while good teaching and clear goals and standards 
show higher loadings in the arts departments. Meaning orientation shows a positive 
TABLE 5 

FACTOR ANALYSIS OF APPROACHES TO STUDYING AND COURSE PERCEPTIONS SCALES FOR TOTAL SAMPLE 
(N = 2208) 

Factors (54% variance explained) 

Variables I II III IV V VI 

Academic Performance 
School 
Higher Education 
Approaches to Studying 
Deep Approach 
Inter-relating Ideas 
Use of Evidence 
Intrinsic Motivation 
26 (−20) 
29 
−27 −34 39 
Surface Approach 
Syllabus-boundness 
Fear of Failure 
Extrinsic Motivation 
−38 53 
Strategic Approach 
Disorganised Study Methods 
27 
Negative Attitudes to Studying −28 
Achievement Motivation 
Comprehension Learning 60 
Globetrotting 
Operation Learning 
Improvidence 
65 
Course Perceptions 
Formal Teaching Methods 
Clear Goals and Standards 
Workload 
Vocational Relevance 
45 
52 −32 
−29 −30 
−33 
75 
(−23) 
73 
Good Teaching 
Freedom in Learning 
Openness to Students 
Good Social Climate 
Eigenvalues 
Percentage Extracted Variance 

FACTOR PATTERN CORRELATIONS 

2 
3 4 5 
F1 
F2 
F3 
F4 
F5 
F6 
−06 
−16 −13 
−23 −15 07 

Decimal points and loadings less than 0.25 omitted. 
TABLE 7 

FACTOR PATTERN CORRELATIONS FOR 
SEPARATE SUBJECT AREAS IN TABLE 6 

Science 
       1     2     3     4     5 
F1 
F2    12 
F3   −18    06 

F1 
F2   −05 
F3   −09    06 
F6    35    08   −16    05    21 

Arts 
       1     2     3     4     5 
F1 
F2   −15 
F3   −07    13 
F5    05    06   −03    09 
F6    32   −18   −15    10    22 



relationship with self-rating on academic progress in all three subject areas, although 
the highest loading is found in arts. 


Reproducing orientation (Factor II) is consistently defined in all three faculties 
with only small variations in the factor loadings. It is related to a heavy workload and 
to poor performance (mainly in the arts). The disorganised and dilatory approach is 
particularly associated with the pathology of globetrotting, and shows a negative 
relationship with academic progress in science. It would seem that arts students 
relying on reproductive learning and disorganised scientists rate themselves less highly 
on academic progress—a finding which certainly makes sense intuitively. It is also 
noticeable that Factor VI (positive evaluation of courses) is linked to positive attitudes 
in all three areas of study. 


Prediction of academic progress 

A useful way of determining which scales predict academic progress most effec- 
tively is discriminant function analysis. Extreme groups were formed in terms of 
students who said they were doing ‘very well’ in their courses (N = 58) and those who 
said they were performing ‘badly’ (N = 43). Table 8 shows the coefficients which 
define the discriminant function. The defining variables are essentially organised 
study methods, positive attitudes to studying, a strategic approach, and (to a lesser 
extent) high scores on achievement motivation and deep approach, combined with low 
scores on surface approach and globetrotting. This function placed students correctly 
in their group in 90 per cent of instances. Of course, this level of prediction is likely to 
be an overestimate, due to the circularity involved in using self-ratings of both progress 
and approaches to studying. Nevertheless, it seems probable that the inventory will 
also be found to have high predictive validity in terms of subsequent degree results. 
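
The logic of a two-group discriminant function can be illustrated with a minimal sketch. It is not the authors' stepwise procedure: it fits Fisher's linear discriminant to two simulated predictors, and only the group sizes (58 and 43) come from the text. The data, the seed, and all function names are assumptions made for illustration.

```python
import random


def mean_vector(rows):
    """Mean of each of the two predictor columns."""
    n = len(rows)
    return [sum(row[j] for row in rows) / n for j in range(2)]


def scatter(rows, centre):
    """2x2 sums of squares and cross-products about the group mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for row in rows:
        d = [row[0] - centre[0], row[1] - centre[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_discriminant(group_a, group_b):
    """Return discriminant weights and the cut-off midway between group means."""
    mean_a, mean_b = mean_vector(group_a), mean_vector(group_b)
    sa, sb = scatter(group_a, mean_a), scatter(group_b, mean_b)
    dof = len(group_a) + len(group_b) - 2
    pooled = [[(sa[i][j] + sb[i][j]) / dof for j in range(2)] for i in range(2)]
    det = pooled[0][0] * pooled[1][1] - pooled[0][1] * pooled[1][0]
    diff = [mean_a[0] - mean_b[0], mean_a[1] - mean_b[1]]
    # weights = inverse(pooled covariance) * (mean_a - mean_b), explicit 2x2 inverse
    weights = [
        (pooled[1][1] * diff[0] - pooled[0][1] * diff[1]) / det,
        (-pooled[1][0] * diff[0] + pooled[0][0] * diff[1]) / det,
    ]
    midpoint = [(mean_a[k] + mean_b[k]) / 2 for k in range(2)]
    cutoff = weights[0] * midpoint[0] + weights[1] * midpoint[1]
    return weights, cutoff


def proportion_correct(group_a, group_b, weights, cutoff):
    """Share of cases placed in their own group by the fitted function."""
    score = lambda row: weights[0] * row[0] + weights[1] * row[1]
    hits_a = sum(1 for row in group_a if score(row) > cutoff)
    hits_b = sum(1 for row in group_b if score(row) <= cutoff)
    return (hits_a + hits_b) / (len(group_a) + len(group_b))


# Simulated (hypothetical) scores on two predictors; group sizes as in the text.
rng = random.Random(0)
very_well = [[rng.gauss(1.0, 1.0), rng.gauss(1.0, 1.0)] for _ in range(58)]
badly = [[rng.gauss(-1.0, 1.0), rng.gauss(-1.0, 1.0)] for _ in range(43)]

weights, cutoff = fisher_discriminant(very_well, badly)
accuracy = proportion_correct(very_well, badly, weights, cutoff)
baseline = 58 / (58 + 43)  # accuracy from assigning everyone to the larger group
print(f"correct assignment: {accuracy:.0%} (baseline {baseline:.0%})")
```

Because the function is evaluated on the same cases used to fit it, the proportion correct in such a sketch is optimistic, which is a further reason to treat a figure such as 90 per cent with caution.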


P. RAMSDEN and N. J. ENTWISTLE 379 


TABLE 8 
DISCRIMINANT FUNCTION ANALYSIS TO PREDICT ACADEMIC 
PERFORMANCE AT UNIVERSITY FROM APPROACHES TO 
STUDYING SCALES (N = 2208) 

                                    Coefficients and 
Variables                           Order of Extraction 
Deep Approach                          0.22          (5) 
Inter-relating Ideas                  −0.10          (11) 
Use of Evidence                        0.17          (9) 
Intrinsic Motivation                  −0.13          (13) 
Surface Approach                      −0.21          (4) 
Syllabus-boundness                    −0.11          (12) 
Fear of Failure                    (not entered)     (−) 
Extrinsic Motivation                  −0.18          (10) 
Strategic Approach                     0.42          (3) 
Disorganised Study Methods            −0.76          (1) 
Negative Attitudes to Studying        −0.46          (2) 
Achievement Motivation                 0.25          (6) 
Comprehension Learning             (not entered)     (−) 
Globetrotting                         −0.29          (8) 
Operation Learning                    −0.24          (7) 
Improvidence                           0.06          (14) 
% correct assignment to groups         90% 
χ² (P <)                               65.0 (< 0.001) 

Note: Groups defined by self-rating of performance as ‘very well’ (N = 58) and ‘badly’ (N = 43). 


The course perceptions scales were also used, separately, to predict self-rating of 
academic progress. The discriminant function was defined mainly by good teaching 
and a light workload; it assigned students to groups with 72 per cent accuracy 
(P < 0.001). 


Prediction of study orientations from course perceptions 

Extreme groups of departments were also formed in order to see whether typical 
orientations could be explained by students' perceptions of their courses. Groups 
were formed by selecting the two highest and two lowest departmental mean scores in 
each discipline, so that each group consisted of twelve departments. One set of con- 
trasting departments was selected by choosing the highest and lowest scoring depart- 
ments on the composite variable ‘meaning orientation’ (deep approach + relating 
ideas + use of evidence + intrinsic motivation). The other set was formed with the 
variable ‘reproducing orientation’ (surface approach + syllabus-boundness + fear of 
failure + improvidence; the extrinsic motivation scores were not included as this 
scale did not have its highest loading on the reproducing orientation in the factor 
analyses). 
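
The selection of extreme departments described above can be sketched as follows. The data layout, field names, and department scores are hypothetical; only the composite definition (the sum of four sub-scale means) and the rule of taking the two highest and two lowest departments in each discipline come from the text.

```python
from collections import defaultdict

# Sub-scales summed to form the 'meaning orientation' composite.
MEANING_SCALES = (
    "deep_approach",
    "relating_ideas",
    "use_of_evidence",
    "intrinsic_motivation",
)


def composite(dept_means, scales):
    """Sum of a department's mean scores on the given sub-scales."""
    return sum(dept_means[scale] for scale in scales)


def extreme_groups(departments, scales, per_discipline=2):
    """Pick the highest and lowest scoring departments within each discipline."""
    by_discipline = defaultdict(list)
    for dept in departments:
        by_discipline[dept["discipline"]].append(dept)
    high, low = [], []
    for members in by_discipline.values():
        ranked = sorted(members, key=lambda d: composite(d["means"], scales))
        low.extend(d["name"] for d in ranked[:per_discipline])
        high.extend(d["name"] for d in ranked[-per_discipline:])
    return high, low


# Hypothetical example: five departments in one discipline, with made-up means.
example = [
    {
        "name": f"P{k}",
        "discipline": "physics",
        "means": {scale: float(k) for scale in MEANING_SCALES},
    }
    for k in range(1, 6)
]
high, low = extreme_groups(example, MEANING_SCALES)
print("high:", high, "low:", low)
```

Applied to six disciplines, two departments per discipline in each direction gives the two groups of twelve described in the text.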


Extreme departments in terms of meaning orientation were predicted best by 
good teaching and freedom in learning. In fact, using just these two variables, 71 per 
cent of departments were placed in the correct group (P < 0.05). Reproducing orien- 
tation was predicted with 75 per cent accuracy using all eight scales. This discriminant 
function was defined mainly by heavy workload, vocational relevance, and a lack of 
freedom in learning. 

Finally, a series of analyses of variance and covariance have been carried out to 
examine the effects of different types of departmental context on approaches to study- 
ing. It was hypothesised that departments which were evaluated positively by their 
students would have higher meaning orientation scores than departments which were 
evaluated negatively. Departments which were evaluated negatively would have 
higher reproducing orientation scores than those which were evaluated positively. 


After removing the variance in the two main orientations attributable to disci- 
plines it was found that a composite evaluation variable of good teaching plus freedom 
in learning was significantly associated with meaning orientation (F = 11.04, df 1, 59, 
P < 0.01). Another composite evaluation variable of freedom in learning plus light 
workload was related strongly and negatively to reproducing orientation (F = 11.97, 
df 1, 59, P < 0.001). There were no significant interaction effects between disciplines 
and these composite evaluation variables; this indicates that the effect of the evalua- 
tion variables is similar in all the disciplines. 


Another issue which was explored using ANOVA was the relationship between 
positive evaluation of a department, positive attitudes to studying, and organised 
study methods. All these variables were related to academic progress. It was found 
that positive evaluation was significantly related to positive attitudes (F = 5.37, df 1, 
59, P < 0.03) but not to study methods. This result reinforces the factor analysis in 
which positive evaluation (Factor VI) was associated with positive attitudes, but not 
with either organised study methods or achievement motivation, in all three subject 
areas. 


DISCUSSION 


It is now possible to speak with confidence about two principal orientations 
towards studying, defined in terms of self-report inventories, which are closely similar 
to Marton's categorisations of deep and surface approaches to reading an academic 
article. The repeated analyses of our own inventory, together with the parallel work of 
Biggs (1978, 1979), clearly indicate the stability and replicability of these two orienta- 
tions. It is also possible to identify an achieving orientation, and there is probably a 
separate dimension which describes organised study methods and positive attitudes to 
studying. This latter dimension has been found here, and previously, to have a 
relatively strong association with academic progress. 


The ultimate goal of this research is, however, to identify ways in which students’ 
approaches to learning may be modified either through appropriate study skills 
courses, or through the course organisation, assessment and teaching methods of 
departments. What can be said now about the effects of academic departments on 
students' approaches to learning and studying? 


First, it is clear that at least the self-rating index of academic progress is related 
strongly to organised study methods, positive attitudes to studying, and to a strategic 
approach combined with high scores on deep approach and low scores on surface 
approach scales. It is also related, but less strongly, to what is perceived as good 
teaching and a light workload. As the analysis is based on two sets of self-ratings of 
individuals, this ‘explanation’ of academic progress verges on the tautological. A 
student who perceives himself as successful is presumably more likely to see the work- 
load as at least reasonable and the teaching as satisfactory. The converse is also true. 
Lack of progress can more comfortably be attributed to departmental inadequacies 
than to personal failings. 


So can we say anything about functional relationships between differing academic 
‘treatments’ and students’ approaches to learning? The analyses which were based 
on departments, rather than on individuals, provide a firmer basis for interpretation. 
The explanation ceases to be tautological, and there is a clear indication that depart- 
ments rated highly on good teaching and freedom in learning have students with 
higher average scores on meaning orientation. Moreover, a positive evaluation of 
departments is associated with positive attitudes to studying. As it has already been 
demonstrated that positive attitudes and a deep approach are linked with academic 
progress, a chain of causality, and of potential educational influence, begins to be 
established. It looks as if changes in teaching (good teaching, greater freedom in 
learning and an avoidance of overloading) are likely to move students away from 
surface and towards deep approaches to learning, and also to improved attitudes, thus 
improving the quality, at least, of what is learned. 


Of course these correlational analyses cannot, in themselves, establish causality, 
but other evidence can be used as well. Marton and Säljö (1976b) showed how 
excessively factual questions induced a surface approach to reading, while Fransson 
(1977) found perceived relevance facilitating the deep approach. These were experi- 
mental studies. Moreover, it is also possible within the Lancaster programme to draw 
on interview data. Students described at length how they tackled individual pieces of 
work, and how their approaches differed between courses and between lecturers or 
tutors. The influence, and directionality, of aspects of relevance in academic content 
and empathetic teaching at the right level is described by many of the students. 


A student of English explained how enthusiastic teaching could develop a positive 
attitude to studying a subject; similarly, a student following an independent studies 
course described the effect of choice over method and content on his attitudes: 


“If they have enthusiasm, then they really fire their own students with the subject, 
and the students really pick it up . . . I’m really good at and enjoy (one course) but 
that’s only because a particular tutor I’ve had has been so enthusiastic that he’s 
given me an enthusiasm for it and now I really love the subject.” (student 6) 


“If you're doing independent studies you're obviously interested in what you're 
doing. Therefore you're in a much more relaxed mental state for approaching 
work: I am, anyway, and other people I know in the course are.” (student 2) 


Another English student, after describing a deep approach to essay-writing in one 
part of her course, contrasted this with a subject taught in another department which 
she found less relevant: 


“It’s a bit confusing, (this subject). When it comes to writing essays, because I'm 
not very interested in it, I tend to rush through the books I’m reading for the 
essays, so I don't really understand it when I’ve finished reading. And because 
there's such a lot of information I think you can either oversimplify or get into too 
much detail. I think I tend to oversimplify.” (student 31) 


The following two extracts illustrate the effects of teaching on students' levels of 
approach. A Physics student, describing a problem he had just attempted, said: 


“I was trying to relate it to the notes I'd got in the lectures, but I don't think I 
understood, I didn't get a grasp of what was physically going on . . . when the 
courses don't relate much to each other, I find it very difficult to sort of look 
round a question . . . We had a good tutor towards the end of last year, and he 
tried to show the relationship between the courses, how they linked in and were 
concerned with one main theme . . . it really seemed to fit in beautifully.” 
(student 11) 
While a Psychology student pointed to 


“the lack of empathy that some of the staff have about the ability levels of the 
students relative to their subject. The concrete knowledge that (we) have is 
virtually nil in some of the areas that we're talked at, at a very high level. So you 
can't attach anything that you've been told to something that you already know, 
which of course is a very important point in learning... I think it’s the overall 
problem of the experts coming in and having to give courses in a few weeks on 
their particular interest, and they have such a wealth of knowledge in that area 
that they start at too high a level. That's what I think happens. They've gone so 
far into their own area that they’ve forgotten that we know nothing, essentially, 
compared with them.” (student 7) 


Of course, these illustrations of how tutors and departments affect students’ 
opportunities to learn and study effectively should not be taken to imply that individual 
psychological differences are not important. Students begin their courses with pre- 
existing, and widely differing levels of ability, motivation, and study skills. What has 
been suggested, however, is that the approaches students adopt are to some extent 
shaped by the teaching, the assessment, and the course organisation. Departments 
thus do have responsibility for the efficiency of learning achieved by their students. 


What might be done to help students? Study skills courses with a greater 
emphasis on matching strategies to specific tasks are one possibility. But in this paper 
the emphasis has been on the effects of good teaching, freedom in learning, and an 
appropriate workload. The meaning of the first two terms can be seen more clearly by 
looking at the defining items of these subscales. 


Good teaching 
Staff here make a real effort to understand difficulties students may be having 
with their work. (Correlation with scale total: 0.71). 

The lecturers in this department always seem ready to give help and advice on 
approaches to studying. (0.69). 

Lecturers in this department seem to be good at pitching their teaching at the 
right level for us. (0.65). 


Freedom in learning 
We seem to be given a lot of choice here in the work we have to do. (0.74). 

Students have a great deal of choice over how they are going to learn in this 
department. (0.72). 

These items give some idea of how teaching and courses might be improved in order to 
facilitate learning. They do not, however, allow us to make specific suggestions about 
changes to the organisation of teaching and learning in the departments. In order to 
do this, more detailed case studies of individual departments will have to be carried out. 
THE LEARNING PROCESSES OF AUSTRALIAN 
UNIVERSITY STUDENTS: INVESTIGATIONS OF 
CONTEXTUAL AND PERSONOLOGICAL FACTORS 


By D. WATKINS 
(Australian National University, Canberra) 
AND J. HATTIE 
(University of New England, NSW, Australia) 


SUMMARY. Two studies are reported which investigate sex, faculty, and age (academic 
year) differences in the study methods of students at an Australian university. Significant 
main effects were found, but there was little evidence of any interactions. Correlations 
with grade point average indicated that success in Science-based faculties was related to 
using a deep-level approach to study, an approach relatively infrequently adopted by these students. 
It would seem that it was the young students, the male students, and the students 
enrolled in Science-based faculties who were most in need of study methods counselling. 


INTRODUCTION 


THE last few years have seen a revival of research interest into the study methods of 
tertiary students. This has come about partly through the apparent inability of non- 
intellective variables (such as motivation, attitudes and personality) to improve 
prediction of tertiary performance much beyond the level provided by intellective 
variables (such as IQ and college entrance tests) alone. As the latter class of variables 
typically account for only 20 to 30 per cent of the variance of grade point average there 
is considerable room for improvement. 


Many tertiary institutions provide study skills courses. Until recently most of 
these courses and much of the research in this area had worked from the attractive 
but naive assumption that there is such a thing as a ‘good’ method of study. Such 
behaviour as taking careful lecture notes, summarising the important points presented 
in lectures and textbooks, setting regular time aside for study free from distractions, 
etc., was assumed to characterise the successful scholar. Unfortunately research has 
found that quite a few successful students do not waste their time with ‘good’ study 
habits while failing students often possess apparently ideal study methods (Lafitte, 
1963; Maddox, 1963). Reviews of this literature and their own intensive work in 
this area have forced researchers such as Biggs (1978) and Entwistle et al. (1971) to 
conclude that not all proficient students follow the same path to success. 


Much of the recent research has focused on those factors which predispose a 
student to adopt a particular approach to study. It would appear that certain 
psychological characteristics, such as being a ‘divergent’ or ‘convergent’ thinker 
(Hudson, 1968; Parlett, 1970), being tolerant or intolerant of ambiguity (Biggs, 
1970a), or being highly or not highly anxious (Stringer et al., 1977), predispose the 
individual to prefer a particular approach to study. There is also some evidence that 
males and females may benefit from different approaches to study (Biggs, 1976). 

Other researchers have emphasised the context in which the learning takes place 
and the content of the learning task itself (Lavin, 1965). Thus it would appear that 
different approaches to study might be differentially effective (a) in Arts and Science 
subjects (Biggs, 1970b; Goldman and Warren, 1973); (b) in objective and essay 
tests (Biggs, 1973); and (c) depending on the method for combining marks for the 
final evaluation (Biggs and Braun, 1972). 

Ramsden (1979) has demonstrated that students in different departments perceive 
themselves to be studying in very different contexts and consequently tend to adopt 
different study strategies. It would seem logical that the longer such students spend 
in a department the more they become socialised into the learning environment of 
that department. In addition a senior student would normally be studying more 
advanced topics and is more likely to be expected to demonstrate an independent study 
capacity. Thus it would be expected that a first year and a senior year student may 
vary considerably in their approach to study. 


The realisation of the complexity of this field of study has influenced some 
researchers to reject the traditional ‘quantitative’ psychometric approach (exemplified 
by much of Entwistle's and Biggs’ work) with its use of highly structured question- 
naires and sophisticated statistical procedures to identify consistencies in students' 
approaches to learning, in favour of a ‘qualitative’ approach (Marton and Svensson, 
1979). This latter research method essentially involves looking at how students 
actually learn through intensive interviews and case study techniques (e.g., Laurillard, 
1979). 


As Entwistle and Hounsell (1979) point out, the ‘qualitative’ and the ‘quanti- 
tative’ approaches are essentially complementary in nature. The former provides 
opportunities to explore and probe the study process domain in a relatively uncon- 
strained manner. If such opportunities are grasped a conceptually rich and accurate 
description of student learning should be forthcoming. However, there is always 
some doubt as to the validity and generalisability of such findings. The ‘quantitative’ 
approach inevitably restricts consideration to a set of inventory items determined by 
the researcher and forces the students to report a general approach to learning—thus 
over-emphasising the consistency of their study behaviour. Yet this approach does 
have the advantage of providing empirically verifiable, quantitative estimates of the 
strength of relationships between different aspects of the study process complex. 
The writers would argue that, at this early stage in this field of study, there is con- 
siderable room for further investigation from both research perspectives. 


THE RESEARCH 


It is clear that the relationship between contextual and personological factors 
and study methods is a complex one, requiring further investigation. Interactions 
between these factors have rarely been studied in a systematic, research-oriented way. 
The purpose of the present paper is to report two studies which explore, from a 
multivariate perspective, the relationships between the approach to study adopted 
by students at one Australian university and their sex, faculty, and academic year 
(age in Study II). The second study will also compare the relationships of tertiary 
achievement, as measured by Grade Point Average (GPA), with the study methods 
of students in different faculties. 


The setting of our research was the University of New England (UNE). The 
four major UNE undergraduate faculties were the focus of our work: Arts, Science, 
Rural Science, and Economics. It has been shown elsewhere that students in these 
different faculties perceived their academic environments differently (Watkins, 1978) 
and that students of differing personality types were attracted to and satisfied by the 
different faculties (Watkins, 1977). 


The instruments used in our research and which are described later in this paper 
were, for Study I, the Biggs’ (1976) Study Behaviour Questionnaire (SBQ) while, for 
Study II, the Biggs’ (1979) Study Process Questionnaire (SPQ) and the Inventory of 
Learning Processes (ILP) (Schmeck et al., 1977). These inventories are recently 
developed examples of the type of multi-faceted instrument required to explore such a 
complex area, in contrast to earlier inventories such as the widely used Survey of Study 
Habits and Attitudes (Brown and Holtzman, 1955) which assumed that only one type 
of ‘good’ study methods existed. 
STUDY I 


Method 

Biggs developed the SBQ as a means of operationalising the study process 
domain. Following the Lewinian approach, he conceived tertiary performance to be 
influenced by personality and institutional factors via the study process complex. 
The items of the SBQ represent, in the main, attempts to operationalise those person- 
ality variables which Biggs’ literature survey indicated may influence the student’s 
approach to academic work. The version of the SBQ studied here was developed 
after much item analysis, factor analysis and validity work. It has 10 unidimensional 
scales as outlined in Table 1. 

TABLE 1 


THE SBQ SCALES (BIGGS, 1976 VERSION) 





1. Pragmatism (10 items) 
Grade oriented; student sees university qualifications as a means to some other end. 
2. Academic motivation (10 items) 
Intrinsically motivated; sees university study as an end in itself. 
3. Academic neuroticism (7 items) 
Overwhelmed and confused by demands of course work. 
4. Internality (8 items) 
Uses internal, self-determined standards of truth not external authority. 
5. Study skills (8 items) 
Works consistently, reviews regularly, schedules work. 
6. Rote learning (8 items) 
Centres on facts and details and rote learns them. 
7. Meaningful learning (8 items) 
Reads widely and relates material to what is already known; oriented to understand all input 
material. 
8. Test anxiety (6 items) 
Worries about tests, exams, fear of failure. 
9. Openness (8 items) 
Student sees university as a place where values are questioned. 
10. Class dependence (7 items) 
Needs class structure; rarely questions lecturers or texts. 




The SBQ was included in the annual postal survey of UNE internal, full-time, 
undergraduates carried out by the Educational Research Unit at UNE. Survey 
forms were sent to a one in three sample of the student body and usable responses 
were received from 562, a 60 per cent response rate. While such a response rate is 
typical of such research it must be kept in mind when interpreting the findings of the 
study. Only subjects for whom complete data were available were used in this research. 
Forty-four students had to be eliminated therefore. The final sample consisted of 
518 students (282 males and 236 females). Of these 231 were enrolled in Arts, 132 in 
Science, 41 in Rural Science, and 114 in Economics while 182 were first years, 141 
second years, 137 third years and 58 fourth year undergraduates. 


Results 

Finn's (1977) program was used to perform a multivariate analysis of variance 
of the subjects’ scores on the 10 SBQ scales with respect to sex, faculty and academic 
year. The results are presented in Table 2. All three main effects were significant at 
the one per cent level while none of the interactions reached this level of significance. 

Where there were significant differences due to main effects a step-wise discrimi- 
nant analysis ultimately forcing in all variables was used to determine which variables 
contributed most to separating the groups. The following differences were found: 
TABLE 2 

RESULTS OF MANOVA OF BIGGS’ SBQ SCALES ACCORDING 
TO SEX, FACULTY, AND ACADEMIC YEAR 

Source of variation      df            F        P 

Sex                      10, 478       6.25     0.00 
Faculty                  30, 1404      3.60     0.00 
Year                     30, 1404      1.68     0.01 
Sex x Faculty            30, 1404      1.05     0.39 
Sex x Year               30, 1404      1.47     0.05 
Faculty x Year           90, 3252      1.15     0.16 


Sex. Females scored significantly higher on motivation, study skills and openness 
and males scored higher on pragmatism, neuroticism, and dependence. In Biggs' 
terminology the females were more internalising and the males more reproducing. 


Faculty. Arts students scored highly on the motivation, internalising, meaning, 
and openness scales, whereas Science students scored highly on the pragmatism and 
fact-rote scales. Rural Science students were more worried, dependent and had more 
organised study skills whereas Economics students were more pragmatic, test anxious 
and dependent. 


When only Arts and Science students were compared, Science students were 
discriminated from Arts students by high scores on fact-rote, pragmatism, neuroticism 
and study skills. Thus, Biggs’ first factor, reproducing, was the discriminant between 
Arts and Science students with the latter being more oriented towards reproducing 
strategies. This finding is in accord with Biggs (1978). 


Year. As was predicted, our results clearly indicated that the more years of 
university study the less likely was a student to use a systematic study method, but 
the more likely to use internalising and open strategies—the deep level approaches 
to study. 


STUDY II 


Study I indicated that there were sex, faculty, and academic year differences in 
the study processes adopted by UNE students. These factors were further explored 
in this study. One problem with Study I, however, was that it could not be determined 
if the differences apparently due to senior academic years were in fact due to maturity 
(age and academic year having been confounded in that research). Therefore in 
Study II the sample was restricted to first year students and sex, faculty and age 
differences were investigated. In addition the relationships between study methods 
and academic achievement were examined with respect to faculty to investigate the 
predictive power of the study method inventories and to see if students were adopting 
the study methods found to be most successful for others of their group. Two more 
recently developed study process inventories were also utilised in this study. 


Method 

The subjects were 249 UNE first year internal undergraduates—once again a 
60 per cent response rate. Of this number 138 were male and 111 female; 113 were 
enrolled in Arts, 53 in Science, 22 in Rural Science, and 61 in Economics; 65 were 
18 years of age, 97 were 19, 26 were 20, and 61 were 21 or over. 


One of the inventories used was the Biggs (1979) Study Process Questionnaire, 
an updated version of the SBQ. The SPQ is based on the proposition that students 
tend to have several broad motives for studying and several broad strategies for going 
about their work. Based on his earlier research, Biggs considers the three most 
important motive/strategy dimensions to be the following: 


1. Utilising 

Motive: to undertake further study as a means of obtaining a better job, 
more money, or some other extrinsic need. 

Strategy: overall, simply to avoid failure and specifically to focus on 
minimal content, primarily factual as prescribed in class handouts, course 
outlines, etc., and to rote learn this necessary minimum for reproduction in 
examinations and/or assignments. 


2. Internalising 
Motive: to work out one's philosophy of life and to develop special interests 
and abilities; studies are selected therefore that hold maximum intrinsic interest. 
Strategy: to read widely and with maximal understanding (independently 
of course requirements), to integrate various subjects and make them personally 
meaningful. 


3. Achieving 
Motive: to excel in studies as part of a general competitive approach to life 
and win high status thereby; more specifically, to study with a view to maxim- 
ising grades awarded. 
Strategy: close orientation to course outlines, work schedule tightly 
organised, assignments completed on time, etc. 
(from Biggs, 1979, p. 2) 


The SPQ consists of 42 items each tapping one of the three broad dimensions 
presented above and each divided into motive and strategy sub-scales of seven items 
in length. 

The other measuring instrument was the Inventory of Learning Processes. 
Schmeck et al. (1977) have developed the Inventory of Learning Processes (ILP) to 
assess individual differences in some of the information processing habits shown to 
be important in laboratory studies of human learning. The ILP consists of 62 items 
divided into four scales: Synthesis-Analysis (which assesses meaningful as opposed 
to superficial information processing); Fact Retention (which assesses attention to 
details and specifics as opposed to generalities); Elaborative Processing (which 
assesses elaborative as opposed to verbatim processing strategies); and Study Methods 
(which assesses repetitive, drill and practice habits of processing information). 


Results 

Finn’s (1977) MULTIVARIANCE program was used to investigate mean 
differences according to sex, faculty, and age. The Biggs and the Schmeck et al. 
sub-scales were analysed separately. 


As Study I had indicated the sources of variance of most interest, various optional 
planned contrasts were specified here. The first related to differences between the 
responses of males and females. The second set related to faculty differences, while 
the third related to age differences (see Table 3). 


Given that a non-orthogonal design was used, the between-group effects were 
re-ordered to test the main effects of interest first, then interactions of interest, and 
finally other 2- and 3-way interactions. All 2-way and all 3-way interactions not of 
interest were pooled. Table 3 presents a summary of the results from the MANOVA 
and Table 4 presents the means of the scale scores for both inventories according to 
sex, faculty and age. Because of the number of statistical tests performed, the 
α = 0.01 level was used to establish statistical significance. 
TABLE 3 

RESULTS OF MANOVA OF BIGGS AND SCHMECK ET AL. INVENTORIES ACCORDING TO SEX, FACULTY, 
AND AGE 

                                         Biggs’ SPQ                Schmeck et al.’s ILP 
Factor         Contrast                  df          F       P     df          F      P 

Sex            Male vs Female            6, 215      3.84    0.00  4, 217      4.24   0.00 
Faculty        Science vs Arts           6, 215     10.30    0.00  4, 217      5.69   0.00 
               Science vs ES             6, 215      4.13    0.00  4, 217      1.02   0.40 
               Science vs RS             6, 215      2.71    0.01  4, 217      1.97   0.10 
Age            18 vs 20                  6, 215      0.65    0.69  4, 217      2.00   0.10 
               18 vs ≥ 21                6, 215      6.64    0.00  4, 217      5.05   0.00 
               18 vs ≥ 19                6, 215      2.39    0.03  4, 217      4.08   0.00 
Interactions   Sex vs Arts/Science       6, 215      0.62    0.71  4, 217      4.84   0.00 
               Sex vs 18/≥ 19            6, 215      0.70    0.65  4, 217      0.75   0.56 
               Other 2-way              78, 1192     0.84    0.83  52, 843     0.81   0.82 
               Other 3-way              36, 947      1.33    0.09  24, 758     2.20   0.00 


It can be seen that there were significant differences in the mean vectors on sex, 
Arts vs Science, 18 vs ≥ 21 for both sets of tests, and Science vs Economic Studies for 
the Biggs subset, and for 18 vs ≥ 19 on the Schmeck et al. subset. Two interactions 
were also significant on the latter inventory. 


It has been recommended often that if the tests for interaction are significant 
then no tests of main effects are appropriate (Cramer and Appelbaum, 1980). Yet it 
may be meaningful to investigate certain contrasts in the cell means to aid in interpre- 
tation of results. If there are no interactions then one method we can use to investigate 
differences is by looking at the univariate F tests. 


There were three significant variables on the Biggs’ subset that discriminated 
between males and females. Males were higher on utilising strategy and females 
higher on internalising motivation and internalising strategy. Comparing Arts and 
Science students, Science students tended to use utilising strategies whereas Arts 
students were more likely to report internalising motives and strategies. Rural 
Science students were more motivated by utilitarian reasons than Science students, 
and Economics students were significantly less internalising than Science students. 
Those students 21 years of age and over were less utilitarian motivated and more 
internalising (with respect to both motive and strategy). 


For the Schmeck et al. inventory, females used organised study methods more 
than males. Students in Arts were more inclined than those in Science to deep-level 
processing, scoring more highly on both the Synthesis-Analysis and Elaborative 
Processing scales. There was an overall trend for older students to depend relatively 
more on elaborative processing and synthesis-analysis—the deep level approaches. 


No significant 2-way interactions were found. There were 3-way significant 
interactions among contrasts on the Schmeck et al. inventory. The MANOVA 
program was re-run specifying single degree of freedom contrasts on the 3-way 
interactions. As this violates assumptions regarding the independence of significance 
tests this tactic must be regarded as exploratory. The results were difficult to interpret 
intelligibly. 

Academic achievement. The correlations found between academic achievement 
(as measured by GPA) and the study process inventory scales for each faculty are 
presented in Table 5. With GPA as the dependent variable and the Biggs and Schmeck 
et al. inventory scales as the independent variables, the following multiple Rs were 
obtained: 0.60 for Arts, 0.75 for Science, 0.72 for Rural Science, and 0.65 for 
Economics. Thus these instruments were quite useful predictors of academic achieve- 
ment. This conclusion may be of particular importance given earlier research at 
UNE which indicated that study methods were not related to college entrance scores. 


However, it is evident that there were faculty differences in the relationship with 
GPA. To have an intrinsic interest in the subjects being studied and to use deep-level 
processing methods as exemplified by the Schmeck et al. Synthesis-Analysis and 
Elaborative Processing scales is apparently a factor in the success of students in all 
faculties. Yet the Biggs’ Internalising Strategy scale was only significantly related 
to achievement in Arts students. This situation may simply reflect differences 
between the Biggs and Schmeck et al. scales. However, the much larger correlations 
with academic success found in the Science-based faculties (but not Arts) for the 
Schmeck et al. Synthesis-Analysis relative to the Elaborative Processing scale tend 
to support Marton and Säljö's (1976) argument that the meaning of the concepts of 
deep and surface levels of study may differ in different contexts, such as different 
subject areas. 


TABLE 5 

CORRELATIONS OF BIGGS AND SCHMECK ET AL. SCALES WITH GPA ACCORDING TO 
FACULTY 

                                                        Rural 
                              Arts        Science      Science     Economics 
Inventory Scales            (N = 113)    (N = 53)     (N = 22)     (N = 61) 

Utilitarian Motivation       -0.17       -0.39*       -0.46*       -0.01 
Utilitarian Strategy         -0.08       -0.40*       -0.52*       -0.14 
Internalising Motivation      0.40*       0.26         0.38         0.30* 
Internalising Strategy        0.24*       0.07         0.15         0.00 
Achievement Motivation        0.15       -0.05        -0.13         0.09 
Achievement Strategy          0.31*       0.09         0.11         0.17 
Synthesis-Analysis            0.28*       0.61*        0.56*        0.47* 
Elaborative Processing        0.24*       0.25         0.27         0.28* 
Fact Rote                     0.12        0.30*        0.07         0.25* 
Study Methods                 0.42*       0.09         0.13         0.43* 

*P < 0.05 


Organised study methods were particularly beneficial to Arts and Economics 
students, while scores on Biggs’ Utilising Motivation and Strategy scales correlated 
significantly negatively with achievement in the Science-based faculties—suggesting 
that the ‘minimax’ reproductive study methods are not likely to be sufficient for 
academic success in these faculties (though they may well be a necessary condition 
fulfilled by virtually all Science students). 


CONCLUSIONS 


The two studies reported here have tried to explore in a systematic way the 
relationship between the study methods adopted by students at one Australian 
university and various contextual and personological factors. By using a multi- 
variate approach it was possible to examine interactions between these variables in a 
way not attempted in most earlier research. The results have pointed to the existence 
of main order effects of sex, faculty, and age rather than to high order interactions. 
In particular, evidence for differences between the study processes of students was 
found according to the factors set out below: 
Sex. Regardless of faculty, academic year or age the females were more likely 
than the males to show interest in their courses and to adopt a deep-level approach to 
their work. At the same time the females also generally seemed to possess more 
organised study methods than the males. The males were more likely to have a 
pragmatic approach to tertiary study, to be more worried about their work, and to 
adopt reproducing strategies which would allow them to scrape through their exami- 
nations. On the basis of these findings it would be expected that females would have 
better academic results than males. Indeed studies of academic progress at UNE 
have shown females to have higher average marks and higher graduation rates than 
males. 


Faculty. Regardless of sex, academic year or age Arts students were the most 
likely to show intrinsic interest in their courses and to adopt a deep-level approach 
to their work. Science students tended to be relatively more motivated by vocational 
concerns and to adopt surface-level reproductive study methods. Rural Science and 
Economics students, too, were more likely to adopt surface-level strategies and were 
apparently more anxious and dependent. That such students also tended to have 
stronger utilitarian motives is not of course surprising given the professional relevance 
of their courses. 

Age. Regardless of sex, faculty, or indeed academic year, the more mature 
students tended to be less motivated by pragmatic concerns and to be more liable to 
adopt a deep-level approach to their work. This result may be of significance given 
the increasing number of mature-age students enrolling at Australian universities and 
supports the contention that this new clientele may require different teaching methods 
to those students straight from school (Hore, 1978). However, it would appear that 
it is the older students who are more likely to use study methods most conducive to 
academic success. To what extent this result is due to intellectual maturation or to 
changes in school teaching methods in recent years would require further research 
to determine. 

In general, our results are consistent with those reported in the literature discussed 
earlier. However, our findings would indicate that in all faculties males (irrespective 
of age or academic year) and younger students (irrespective of sex or academic year) 
are more inclined to be pragmatically motivated and to adopt reproductive study 
methods, approaches to study which are negatively correlated with academic success, 
especially in Science-based subjects in which the majority of such students are enrolled. 
This would indicate that much more is involved in the choice of study methods 
adopted by students than simply the context of learning. Students are not adopting 
the study methods most likely to lead to academic success in the particular courses 
they are studying. A detailed analysis of the academic tasks facing students in 
different courses and the appropriateness of the teaching methods adopted (both at 
school and at university) may lead to a fuller understanding of this finding. 

The tendency of personality factors to be related to study methods also suggests 
the importance of personological as well as contextual factors. Of course, however, 
we did find evidence for the relationship of faculty and the study method adopted by 
students, independently of sex or age, and there are other contextual variables which 
may be of importance not considered in this research (e.g., methods of instruction, 
type of assessment). 

Clearly there is a need for far more research in this area, from both a ‘quantitative’ 
and ‘qualitative’ perspective, before we can feel confident that we understand 
why students adopt a particular approach to their studies. While there must always 
be some doubt about the generalisability of findings from one institution (and with 
response rates of 60 per cent at that), our results would suggest that it is the young 
and the male students, particularly in Science-based faculties, who tend to be most in 
need of study methods counselling. 


RESEARCH NOTE 

PERSONALITY AND ARITHMETIC OF NORMAL SCHOOL PUPILS AND 
BOYS IN A COMMUNITY HOME WITH EDUCATION 


By P. WATSON 
(Leeds University) 


SUMMARY. Relationships between personality and mechanical arithmetic test measures in 
118 eleven to twelve-year-old normal school pupils were studied. Speed was found to be related 
to attitudes, mainly Jesness Inventory subscores, and accuracy to cognitive styles. Specifically, 
errors involving carelessness, including those due to ‘set’, were related to Impulsivity on Kagan’s 
test. Then CHE boys were found to differ from the normal pupils in terms of Jesness scores and 
Impulsivity. They attempted fewer arithmetic items and, on a more difficult test, were markedly 
less accurate than normal pupils. They made proportionately more 'set-errors'. 

Lower arithmetic attainment scores of CHE boys are not wholly attributable to lack of 
knowledge but are related partly to attitude and partly to Impulsivity. 


INTRODUCTION 


If there is a relationship between personality variables and attainment in simple arithme- 
tic in normal school children and if the same personality variables differentiate boys in a 
‘Community Home with Education on the Premises’ (the CHE boys) from normal children, 
then this may add to the understanding of the arithmetic of CHE boys. In the project 
designed to test this, the choice of tests was determined by previous work relating personality 
to arithmetic (Watson, 1978) and work comparing normal children and CHE boys (Berry, 
1971; Saunders and Davies, 1976). 


METHOD 


The sample of ‘normal’ children consisted of the year group of 56 boys and 62 girls 
aged 11 to 12 in an urban middle school. All 55 boys in a CHE were given the tests. Their 
ages covered the secondary school age range. Thirteen were of the same age range, 11 to 12, 
as the normal school sample. 


The tests 

Each pupil completed: the Junior Eysenck Personality Inventory (JEPI), the Jesness 
Inventory, Raven's Matrices, Witkin's Group Embedded Figures Test (GEFT) and a 
modified version of Kagan's Matching Familiar Figures Test (MFFT). In this latter, each 
pupil selected, on each page of the MFFT booklet, the figure he/she considered identical to 
the standard one and recorded it. As soon as the whole booklet was finished, the pupil's 
total time was recorded, as was the number of errors the pupil made. Thus, the time and 
error measures used were not identical with those in other studies. 


The pupils were also given two mechanical arithmetic tests. One examined addition and 
subtraction, the other, all four basic arithmetic processes. Items in both were in a 
non-progressive order and involved only integers. In addition to the gross attainment scores, 
elements of these attainment scores were examined. So, for each arithmetic test, four scores 
were obtained: the pupil's speed, i.e., the number attempted in a limited time, the number of 
errors, the total-correct score and accuracy, i.e., number correct/number attempted. Before 
the testing started, pupils listed their school subjects in order of preference. 
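
The four scores defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `score_test`, the use of `None` for an item not reached, and the example responses are all hypothetical and do not come from the study.

```python
from dataclasses import dataclass


@dataclass
class ArithmeticScores:
    speed: int          # number of items attempted in the time limit
    errors: int         # attempted items answered wrongly
    total_correct: int  # attempted items answered correctly
    accuracy: float     # number correct / number attempted


def score_test(responses, answer_key):
    """Score one pupil's paper.

    responses is aligned with answer_key; None marks an item not attempted
    (a hypothetical convention chosen for this sketch).
    """
    attempted = [(r, k) for r, k in zip(responses, answer_key) if r is not None]
    speed = len(attempted)
    total_correct = sum(1 for r, k in attempted if r == k)
    errors = speed - total_correct
    # Guard against a pupil who attempted nothing.
    accuracy = total_correct / speed if speed else 0.0
    return ArithmeticScores(speed, errors, total_correct, accuracy)


# Hypothetical pupil: six items on the paper, five attempted, one wrong.
example = score_test([7, 12, 3, 9, 20, None], [7, 12, 4, 9, 20, 15])
```

Separating speed from accuracy in this way is what allows a low total to be traced either to attempting few items or to getting attempted items wrong.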


As a subsidiary enquiry, arithmetic errors were examined in detail. They may be sorted 
into several categories (Engelhardt, 1978). Here two sub-groups of errors attributable to 
carelessness, rather than to lack of understanding, were studied. These are errors due to 
‘set’ (e.g., a pupil does three subtraction sums and the next, which is addition, is tackled as if 
it were subtraction) and errors in basic addition or subtraction number bonds. Engelhardt 
has shown that pupils with high Impulsivity scores make more number bond errors than other 
pupils. Because 'set-errors' occurred in the responses of about three-quarters of the pupils 
to the simpler arithmetic test, but in the responses of fewer than one quarter of the pupils to 
the other arithmetic test, they were studied here only in the simpler arithmetic test. 
RESULTS 
The normal school sample 
With both arithmetic tests, there were significant relationships between Raven's Matrices 
scores and arithmetic test variables. For example, with the simpler test, the speed, error and 
total correct scores had Pearson product-moment correlation coefficients of 0.42, -0.40 and 
0.54 respectively with the Matrices scores. Therefore in the following consideration of 
other relationships, the effect of the Matrices scores is partialled out. 
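
Partialling out a third variable can be illustrated with the standard first-order partial correlation formula. The sketch below is not taken from the paper: the function name is hypothetical, and only the value 0.42 (speed with Matrices on the simpler test) is reported in the text; the other two coefficients are invented for illustration.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denom


# x = arithmetic speed, z = Raven's Matrices, y = a personality score.
r_speed_matrices = 0.42        # reported in the text
r_speed_personality = -0.25    # hypothetical value
r_personality_matrices = -0.10 # hypothetical value

r_partial = partial_correlation(
    r_speed_personality, r_speed_matrices, r_personality_matrices
)
```

With these illustrative inputs the partial coefficient is a little smaller in magnitude than the zero-order one, because part of the raw association is shared with the Matrices scores.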


As Table 1 shows, on both tests, speed was related to attitudes and errors to cognitive 
style, in particular Impulsivity. 

In the subsidiary study of ‘careless’ types of error, it was found that Impulsivity was 
related not only to number bond errors, as Engelhardt reported, but also to ‘set-errors’. 
Thus, using multiple correlations involving both MFFT scores for Impulsivity and with 
Raven's Matrices scores partialled out, set-errors and number bond errors had correlations 
of 0.21 and 0.30 respectively with Impulsivity. 


TABLE 1 

PARTIAL CORRELATIONS BETWEEN ARITHMETIC VARIABLES AND PERSONALITY 
VARIABLES CONTROLLING FOR RAVEN’S MATRICES SCORES 

VO AL MA Pref MFFT(E) MFFT(T) EFT 

Sp1 -0.20 
Sp2 -0.28 -0.36 -0.18 -0.22 
Er1 0.32 -0.43 -0.19 
Er2 0.28 -0.35 -0.21 
Tot1 -0.19 -0.21 0.19 
Tot2 -0.32 -0.39 -0.21 -0.20 0.27 

The left-hand column variables are speed, errors and totals for arithmetic tests 1 and 2. Across 
the top, the first three are Jesness sub-scales (i.e., Value Orientation, Alienation and Manifest 
Aggression) followed by the position of mathematics in the pupil's subject preference order. The next two 
are Matching Familiar Figures Test error and time scores, followed by Embedded Figures Test 
scores. For relationships where there are no figures in the table, the results were not significant. For 
a sample of this size, correlations of 0.17 are significant at the 5% level. 


TABLE 2 

COMPARISON OF THE SCORES OF NORMAL SCHOOL PUPILS WITH THOSE OF BOTH THE TOTAL CHE 
SAMPLE AND THE ‘SAME AGE’ CHE SAMPLE 

            Normal School     Total CHE group               CHE Same-age Sub-group 
            (N = 118)         (N = 55)                      (N = 13) 
Variable    Mean    SD        Mean    SD     t      P       Mean    SD     t      P 

Ex          17.9    3.7       16.2    3.8    2.8    <0.01   16.0    2.9    1.8    NS 
N           13.5    4.3       13.2    5.5    0.4    NS      15.7    3.7    1.7    NS 
L            4.0    2.4        3.4    2.3    1.5    NS       2.8    2.2    1.7    NS 
VO          18.9    6.5       19.7    7.2    0.7    NS      23.0    5.2    2.2    <0.05 
EFT         11.4    4.9       12.5    5.6    1.3    NS      12.5    5.4    0.8    NS 
Ravens      38.0    7.5       34.7    7.6    2.7    <0.01   33.0    8.1    2.3    <0.05 
*MFFT(E)     6.4    2.5        7.2    2.6                    8.1    1.4 
*MFFT(T)    14.3    4.1       12.1    5.7                   12.0    5.1 

Variables here are as in Table 1 with the addition of the JEPI Extraversion, Neuroticism and 
Lie scores and Raven's Matrices scores. 

* The MFFT scores comparison was as follows. Using the normal school means, four groups 
were differentiated; ‘Impulsive’ pupils were above average in errors and below average in time score 
and ‘Reflective’ pupils vice versa. There were two other groups left. Of the ‘same age’ CHE boys, 
sorted into these groups, 10 were Impulsive, 2 fell into the other groups and none was Reflective. (One pupil did 
not take this test.) This is significant. 
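
The grouping described in the note to Table 2 can be expressed as a simple rule. The following is a minimal sketch, not the author's procedure: the function name is hypothetical, the treatment of scores exactly at the mean is an assumption, and the means 6.4 (errors) and 14.3 (time) are taken from the normal school columns of Table 2 on the assumption that the first MFFT row is the error score.

```python
def classify_mfft(errors, time, mean_errors, mean_time):
    """Assign a pupil to an MFFT group relative to the reference means.

    Scores exactly at a mean fall into "Other" here; the source does not
    say how such ties were handled.
    """
    if errors > mean_errors and time < mean_time:
        return "Impulsive"   # many errors, fast
    if errors < mean_errors and time > mean_time:
        return "Reflective"  # few errors, slow
    return "Other"           # fast-accurate or slow-inaccurate


# Reference means assumed from the normal school sample in Table 2.
NORMAL_MEAN_ERRORS = 6.4
NORMAL_MEAN_TIME = 14.3

# Three hypothetical pupils.
groups = [
    classify_mfft(9, 10.0, NORMAL_MEAN_ERRORS, NORMAL_MEAN_TIME),
    classify_mfft(4, 18.0, NORMAL_MEAN_ERRORS, NORMAL_MEAN_TIME),
    classify_mfft(9, 18.0, NORMAL_MEAN_ERRORS, NORMAL_MEAN_TIME),
]
```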


Comparison of the samples 

The CHE boys had lower Raven's scores (Table 2) than the normal pupils. They ob- 
tained significantly higher Jesness Inventory VO scores and were more impulsive on the 
MFFT than the normal school pupils—among whom, incidentally, boys and girls did not 
differ significantly in Impulsivity. 
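
The t values in Table 2 can be checked approximately from the tabled means and standard deviations. The sketch below assumes a pooled-variance two-sample t-test, which the paper does not state; the function name is hypothetical, and the inputs are the Value Orientation figures for the normal school sample and the same-age CHE sub-group.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic from summary data, assuming equal variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean2 - mean1) / se, df


# Jesness VO: normal school (18.9, SD 6.5, N 118) vs same-age CHE (23.0, SD 5.2, N 13).
t_vo, df_vo = pooled_t(18.9, 6.5, 118, 23.0, 5.2, 13)
```

Under this assumption the statistic comes out close to the tabled value of 2.2 on 129 degrees of freedom; a different test, or rounding in the tabled summaries, would give slightly different figures.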


CHE arithmetic 

Because of the influence of age on arithmetic performance, the comparison, Table 3, is 
between normal school pupils and CHE boys of the same age. The latter had lower total 
scores on both tests. There is, however, a difference in how these lower scores were obtained. 
On the simpler test, the CHE boys were slower but not all that less accurate than the normal 
school pupils. On the more complex test, while still tending to attempt less, their accuracy 
became significantly lower. 


TABLE 3 


COMPARISON OF THE NORMAL SCHOOL AND THE CHE 
SAME-AGE SAMPLE ARITHMETIC SCORES 


            Normal School     CHE Same-age Sample 
            (N = 118)         (N = 13) 
Variable    M       SD        M       SD      t      P 

Sp 1        34.4     7.5      23.8     6.4    4.8    <0.001 
Sp 2        21.5     8.5      16.5     7.6    2.0    <0.05 
Ac 1        80.3    16.5      75.5    16.2    1.0    NS 
Ac 2        71.0    21.2      56.3    25.9    2.3    <0.05 
Tot 1       28.3     9.4      18.4     7.3    3.6    <0.001 
Tot 2       16.2     8.7      10.8     7.0    2.1    <0.05 


Here Ac is accuracy, i.e., number correct/number attempted. 


The results concerning Impulsivity, in Table 2 and in the subsidiary study of types of 
error, suggest that CHE boys would have a higher proportion of ‘set’ and bonding errors than 
normal pupils. This was only checked in the simpler arithmetic because of the extremely 
low frequency of ‘set’ errors on the other test. Comparison of the proportions (Guilford 
and Fruchter, 1973) of ‘set-errors’ to numbers attempted for the normal school pupils 
(6.5 per 100) and the same-age CHE boys (10 per 100) confirmed that CHE boys had a higher 
proportion of set-errors than normal school pupils (Z = 2.26, P < 0.02). However, there was 
not a significant difference between the two groups in terms of number bond errors. 
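
A comparison of two proportions of this kind is commonly made with a pooled z statistic. The sketch below illustrates that general technique only; it is not claimed to be the exact procedure of Guilford and Fruchter used in the study. The function name is hypothetical, and because the paper reports only the rates (6.5 and 10 per 100) and not the numbers of items attempted, the counts used here are invented and will not reproduce the reported Z.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for the difference between two proportions."""
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    return (p2 - p1) / se


# Hypothetical counts chosen only to match the reported rates:
# 65 set-errors in 1000 attempts (6.5 per 100) vs 30 in 300 (10 per 100).
z_example = two_proportion_z(65, 1000, 30, 300)
```

The size of the statistic depends heavily on the number of items attempted in each group, which is why the reported rates alone are not enough to recover it.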


DISCUSSION 

The project highlights the importance of attitudes and Impulsivity in mechanical arithme- 
tic. It emphasises particularly the relation of Impulsivity to certain types of errors. From 
a teaching point of view, it should be noted that Impulsivity is reported as being modifiable 
(Barstis and Ford, 1977). 

What of the differences between the samples? JEPI scores did not significantly 
differentiate them, whereas the Jesness Inventory scores did (cf. Vallance and Forrest, 1971). The 
CHE boys were more impulsive. 


The CHE boy scores less on arithmetic tests partly because he attempts less—as might 
be expected from the evidence on attitudes in Tables 1 and 2. He is slower even on simple 
arithmetic on which he is about as accurate as normal pupils. But he scores less also, on 
arithmetic more appropriate for his age, because he gets wrong a higher proportion of those 
which he attempts than normal school pupils. Further, a high proportion of his errors, 
rather than being errors due to lack of knowledge, are due to carelessness. Thus, his low 
arithmetic scores are not necessarily wholly attributable to lack of attainment (i.e., knowledge) 
but are related partly to attitude and partly to Impulsivity. 


ACKNOWLEDGEMENT. The author would like to thank the pupils, headteachers and staff who took part in 
this study. 
BOOK REVIEWS 


BENNETT, N., ANDREAE, J., HEGARTY, P., and WADE, B. (1980). Open Plan Schools. 
Slough: NFER, pp. 903, £9.75. 


Whereas Professor Bennett’s previous book (Teaching Styles and Pupil Progress) was 
launched amid the glare of media publicity and has since been the subject of fierce debate, this 
one is relatively uncontroversial. The impetus for it came from what seemed a real problem: 
the fact that by the mid-1970s there had been a rapid expansion of open plan primary schools 
to the point where they represented about 10 per cent of all the primary schools in England 
and Wales, and yet there was no detailed and comprehensive study of what open plan designs 
were like, how such schools were operated, what problems they posed and how teachers were 
reacting to them. Professor Bennett and his team have certainly now filled the gap with a very 
thorough study based on questionnaire surveys to all primary schools considered to be open 
plan in England and Wales, involving all head teachers and one-third of the teachers, and 
followed up by detailed observation and interview studies in 23 selected schools. 

The main conclusions are that open plan designs fall into two main types: type one with 
shared teaching space in addition to shared practical and/or enclosed areas such as quiet 
rooms and type two with shared practical and/or enclosed areas but no shared teaching space. 
The actual designs themselves, however, are so varied that to describe a school as open plan 
says little about it except that it does not wholly consist of conventional classrooms. Between 
these varied designs and the way in which teaching and learning is organised there is no 
necessary connection—a point brought out very neatly by case studies five and six in Chapter 
10 which compare two contrasting forms of organisation in schools of identical design. The 
only respect in which design seems to have a consistent effect is that type one designs seem to 
encourage team teaching, especially in infant schools. 

Perhaps the most interesting sections in the book are those on curriculum allocation and 
pupil involvement, which analyse in detail the amount of time devoted in the schools observed 
to different areas of the curriculum and the use which pupils made of that time. These bring 
out strikingly the wide variations in curriculum balance which inevitably go with the relative 
autonomy of the British teacher and the extent to which this can affect the learning experiences 
of the children. They also highlight the fact that in all units between one-fifth and one-quarter 
of the children’s time is spent in activities which could only be classified as administration/ 
transition, without being able (for want of relevant research) to draw firm comparisons with 
conventional schools. Paradoxically perhaps these sections are the ones which seem the least 
clearly associated with the specific problems of open plan schools. Although they provide a 
fascinating picture of what was going on in the open plan units, this appears not to be 
markedly different from what goes on in other primary schools. 

Not surprisingly in such a detailed report a number of minor mistakes have crept into 
the text, e.g., page 66, the column headed ‘ junior ideal ’ should be headed ‘ junior actual ’; 
page 128, the reference to Figure 2 is really to Figure 12; page 173, “ furniture is influenced 
by the organisation and style of teaching used, by the amount and type of furniture...” and 
so on. But these are unimportant blemishes in what is otherwise a meticulous piece of work. 

In one sense the study has come too late. Although it may still have some value in laying
to rest certain popular myths about open plan schools—that they consist of large, undif- 
ferentiated teaching spaces in which informal teaching methods reign supreme—there is 
little in the conclusions which will surprise those with first-hand knowledge of a range of open 
plan schools. It should really have been available to the educational planners of the late 
1960s and 1970s who undertook the great expansion of open plan schools without any serious 
evaluation of the buildings and without giving proper training to the teachers who had to 
work in them. It does, however, provide an invaluable handbook to those responsible for the 
open plan schools of the future. Let us hope that they take heed of it. 

W. B. MARKER. 


COFFIELD, F., ROBINSON, P., and SARSBY, J. (1981). A Cycle of Deprivation? London:
Heinemann Educational, pp. vii + 226, £11-50.


It is ironic that a way of life which appeals to so few should interest so many. Poverty is 
one such existence. Voluntarily embraced only by small bands of ascetic clerics, such an 
existence has however long fascinated large groups of aesthetic novelists, acidic journalists 
and antiseptic social scientists. The main consequence of such a paradox, of course, is that
those who write about poverty have rarely experienced it. They are, unmistakably, outsiders.
To the novelist and journalist ‘ being on the outside’ is a routine challenge, to be met by 
imagination, empathy and by whatever other projective skills they are blessed with. To the 
social scientist, however, such cultural distance is an ever present threat, constantly endangering
that most prized attribute of social science, namely its objectivity. The authors immediately
recognise the threat to their trade of ‘being on the outside’. Many pages are spent
agonising over the objectivity of accounts produced by researchers who have, as they admit,
“...no natural place in the world observed and (are) constantly interpreting the picture which
is presented” (p. 12). Unfortunately, however, their resolution of this age-old problem
borders on the naive. As they put it: “Our assumption was that with involvement over time,
many inconsistencies would be resolved and many omissions rectified” (p. 13). In itself, of
course, there is nothing particularly disturbing about such an assumption: without time and 
involvement it is difficult to see how any outsider could ever penetrate another culture. That 
said, however, it is equally difficult to see how such an assumption safeguards objectivity, 
given that the central threat to objectivity derives, as they see it, from the fact that the 
“researcher is a complex of prejudices and opinions” (p. 12). As an antidote to prejudice
such an assumption is not merely transparent; it is, for a team of social scientists, transparently
uninformed. Prejudice, as attitudinal research well attests, is not like perfume: it does not
evaporate over time, or dissipate on contact. 

It is of course very easy to make too much out of objectivity. Having no sophisticated 
defence against one's own prejudices is hardly a ‘killing matter’. That said, however, not
having such a defence does have one awkward implication for the social scientist: he can no
longer claim that his account is any more ‘objective’ than that of the novelist or the journalist.
Without such a defence he is in open competition with other ‘outsiders’. In this respect
Coffield, Robinson and Sarsby come off badly. Lacking the sharp eye of the novelist and the
evocative cachet of the journalist their description of poverty is bland. True, it may be
objected that social science has a different brief—that its intent is not merely to describe
poverty but to explain it. Such a view has much to commend it, but at this level A Cycle of
Deprivation? has rather less. As may be surmised from the title, the authors are sceptical
about the utility of this concept in analysing poverty. As they put it, “The major implication
which stems from our fieldwork is that the cycle of deprivation is too simple an idea to
explain the complex lives of these families” (p. 163). This ‘major implication’, however,
has one major limitation: the cycle of deprivation has never assumed poverty to be simple. 
Whatever its faults, the cycle of deprivation even in the hands of such an ardent populariser 
as Sir Keith Joseph took poverty to be ‘ cumulative’ and ‘complex’. Ironically this is the 
very first point these authors make, for on page one we read: “Sir Keith was anxious not to
be misunderstood as suggesting that there was some single process by which social problems
reproduce themselves” (p. 1). Given the ‘major implication’ of this investigation, Sir
Keith's plea evidently fell on deaf ears. 

J. MURPHY. 


HERSOV, L., and BERG, I. (Eds.) (1980). Out of School. London: Wiley, pp. xii + 367,
£15-50.


This book of 17 contributions, although united by the topic of non-attendance at school,
represents a wide range of approaches to the subject, and indeed to rather different subjects 
(truancy, separation anxiety, school phobia and school refusal). Mitchell and Shepherd's 
introductory chapter charts the subject well, and ends with the intriguing observation that 


“It seems that those children who attend school regularly while evincing signs of distress
and antipathy and marked reluctance to do so may be seen as the casualties of a compulsory
school system . . . existing studies have tended to deal with children whose efforts
to avoid school are successful . . . we must, however, ask whether studies which concentrate
on the successful deviant are really exploring sufficient instances of the deviance in
behaviour they purport to study” (p. 22).


When one considers the implications of the approach suggested by Mitchell and Shepherd, it 
becomes difficult to overstate its importance. 

David Farrington’s chapter on ‘ Truancy, delinquency, the home and the school’ 
concludes on the basis of his work (p. 62), that “there was no evidence that secondary schools
had an important influence on either truancy or delinquency”. Some 40 pages later, Reynolds
et al. conclude


“ Large and consistent variations between schools in their levels of attendance . . . do 
not appear to be explicable by variation in the characteristics of their pupil intakes, 
whereas much more of the variation is explicable using only a limited range of factors 
that describe the nature and process of the pupils’ schools” (p. 105).


The limitations of their study, notably in terms of the sample of schools used, sit ill with the
somewhat bellicose style of the contribution. Most of the remainder of the book is written as 
though the emphasis of Farrington were right, and Reynolds wrong, since it describes legal 
and therapeutic remedies to persistent absenteeism and school refusal focused on the child. 
No doubt information on the management of school refusers is easier to come by than 
information on school factors associated with absenteeism, but if Reynolds is right the thrust 
of the effort ought to be elsewhere than it is in this book. On the basis that even if he is 
right the requirement of school attendance will not wait upon the perfection of the school 
system, the remainder of the book is likely to be of use to practitioners. Moreover, it is clear 
that the perfection of schools is never going to be more than marginally relevant to patho- 
logical separation anxiety. 

In the reviewer's opinion, the two most important contributions in the book are the 
chapter on the long-term outcome of truancy by Lee Robins and Kathryn Ratcliffe, and the 
study of school attendance and the first year of employment undertaken by Grace Gray and 
her colleagues. They are important because they set out to identify the justification for 
caring about non-attendance at school. If it is just hurt pride that our schools are not 
attracting and keeping the interest of their pupils, it matters much less than if there are 
long-term sequelae of non-attendance. Gray finds that absentees are more likely to leave 
school at the first opportunity, and without formal qualifications. Most importantly, Gray 
found “little support” for the view that “truancy represents a form of maladjustment which
is likely to lead to work difficulties in much the same way that it leads to absenteeism”
(p. 368). In particular the results “showed no associations between fifth year absenteeism
and job skill level, job satisfaction, the number of jobs, dismissal from jobs or further training,
once examination achievements had been taken into account” (p. 368). Robins and Ratcliffe's
chapter, based on American blacks, is less sanguine:


“Both elementary and high school truancy were associated with dropping out of school
before completing secondary education, and also with low earnings as an adult. High
school truancy was strongly related to a variety of adult deviant behaviours, and somewhat
associated with psychological disturbance ... In addition to the effects discussed
here, those truanting from elementary schools tended to marry women who had been
similarly truant and to produce truant sons and daughters, thus perpetuating a truant
pattern in the next generation” (p. 81).


Let's hope Gray's longitudinal study demonstrates in the future that these data are not 
generalisable to British schoolchildren. 

The book reviewed is an interesting and worthwhile addition to the literature. Three
points will remain with the reviewer. First, there is enough evidence of damage caused to 
children through absenteeism to justify the search for remedies. Second, a better under- 
standing of the dynamics of the problem will come through the study of the discontented 
school attender. Third, emphasis on change in schools or school provision rather than 
pupils would yield more in the present state of knowledge. There are parallels with the race- 
IQ controversy, and the possibility of improvement is enhanced if one behaves as though the 
differences in IQ between racial groups were of social rather than of genetic origin. Similarly, 
whatever the causes of truancy, it is at this stage helpful to behave as though they were located 
in the school, not the pupil. 

KEN PEASE. 


LYTTON, H. (1980). Parent-Child Interaction: the Socialization Process Observed in
Twin and Singleton Families. New York: Plenum Press, pp. xx + 364, $35-00. 


This book describes what the author himself refers to as a ‘mammoth’ research project.
It is based on a sample of 136 2½-year-old children and their parents—not, it may seem, a
huge number, but when one considers the mass of material collected on each family (over
30 hours’ worth of data collection and transcribing), the multiplicity of methodologies used 
and the numerous painstaking analyses, one comes to appreciate the enormous scale of this 
exercise. Quantitatively speaking, it is an impressive undertaking; qualitatively, it represents 
a most useful contribution to the socialisation literature. 

The aim of the book is to describe what transpires between parents and children. In this 
respect it is comparable to those other well-known investigations by Sears, Maccoby and 
Levin and by the Newsons, carried out during the 1950s and the 1960s respectively. Lytton's 
project is the equivalent for the 1970s, and as such it reflects the main changes that have come 
to characterise our thinking about the socialisation process: conceptually, regarding the 
parent-child relationship as reciprocal rather than as unilateral in nature; methodologically, 
emphasising observations rather than more indirect data-gathering techniques such as inter- 
views and questionnaires. In addition, the questions asked here reflect current interests: the 
way in which patterns of communication manifest themselves between parents and young 
children; the contingencies to be found in the attachment relationship; the contribution of 
genetic factors to variation in social behaviour, and so forth. It is thus a significant landmark 
in our study of this important topic, and though a section on practical implications is some- 
what weak and forced there can be no doubting that work of this nature will eventually 
lead not only to an understanding but also an improvement of the complex art of child rearing. 

There are a number of specific features that make this report valuable. For one thing, it 
is based on the use of multiple strategies for gathering data, as opposed to the all-eggs-in-one- 
methodological-basket policy normally adopted. In particular, the three main techniques 
used were observational counts, interviews and laboratory experiments. Lytton's comments 
about the respective uses of these methods are particularly interesting: it appears that obser- 
vations turned out to possess the greatest and experiments the least heuristic value. Then 
again, the inclusion of both monozygotic and dizygotic twins as well as singletons gave this 
study a feature that few other investigations of this type have included, namely the ability 
to weigh up the respective contributions of inherent and environmental factors. In fact the 
genetic analysis of variations in social behaviour yielded largely negative results, yet this in 
itself is of considerable interest. What is more, an opportunity was given to make twin- 
singleton comparisons both for the developmental progress of the children themselves and 
for the kind of rearing tasks which their parents face. On the whole, it seems to be a definite 
disadvantage to be a twin, and even more so to be the parent of a pair of them. And then 
finally, the inclusion of fathers in the study makes it possible to carry out some most revealing 
comparisons between the two parents: in general, the similarities between them easily 
outweigh the differences. 

Inevitably, there are reservations with regard to certain specific points of methodology 
that a report on such a complex project will give rise to: the faith in cross-lagged correlations 
for teasing out cause-effect sequences; the operational definitions of some of the concepts 
investigated; the reliance on written recordings during observational sessions, and so forth. 
But overall one must be impressed with the tremendous amount of care taken in collecting, 
analysing and reporting the data. This does make the book into a rather technical report
which all but the specialists will find difficult to read from cover to cover. Fortunately
summaries are given from time to time; even so, it is essentially a book for the dedicated
research worker and specialist teacher. Whether the main contribution of this work is (as 
claimed) to provide a natural history of the socialisation process is open to debate: there are 
difficulties about generalising the findings beyond this sample (exclusively male, Canadian, 
and not necessarily representative even within those limits), and there is always the distorting 
effect of observer-presence. It is rather with regard to processes that this book contributes to 
our understanding, i.e., in respect of the manner in which parents and children affect each 
other—never mind the incidence. From that point of view this is a most valuable report.

H. R. SCHAFFER.


PIATTELLI-PALMARINI, M. (Ed.) (1980). Language and Learning—The Debate between
Jean Piaget and Noam Chomsky. London and Henley: Routledge and Kegan
Paul, pp. xxxvi + 409, £9-75.


The debate concerns how to account for the necessity (acquired by all humans) and 
specificity of cognition and at the same time account for its overall flexibility and variability. 
This question is one thing (the only thing?) upon which all participants agree. In answering
it various ramifications emerge. 

Piaget's constructivist answer was that only the functioning of intelligence is hereditary 
and this genetic base creates structures through the organisation of successive actions 
(“reflective abstraction”) performed on objects (Chapter 1). The attainment of sensorimotor
intelligence is sufficient to ensure language development (Chapter 2); because language is but 
one aspect of semiotic function—the major achievement of sensorimotor intelligence (Chapter 
7). Chomsky’s nativist answer is two-fold. First, the identification of general linguistic 
principles (formal universals) requires that these structural principles are innate (Chapters 1
and 4). Second, Chomsky does not deny that some aspects of language use may be related to 
other aspects of cognition (Chapter 5), but these other cognitive structures must also be
innate (Chapters 6, 12 and Part II). This is because, on any inductive theory of learning, the
acquisition of knowledge requires that the later acquisition be actually exploited in the 
earlier stage for it to be formulated as a testable hypothesis. 

From these differing view-points three major themes emerge: the aspects of cognition 
that are innate; general purpose learning procedures as opposed to specific abilities; and the 
developmental relationship between language and cognition. Piaget's minimal emphasis on 
nativism was based on his beliefs that the stability and necessity of cognitive structures could
not be due to random mutation, and that any account of child development must also account 
for phylogenesis. His view that the necessity and specificity of language are constructions 
from general intelligence is a consequence of the first belief. Chomsky's arguments for an 
innate linguistic ability are his detailed analysis of formal universals; the observation that no 
general purpose learning mechanisms have been proposed that are specific enough to account 
for the acquisition of such universals; and (with Fodor) the failure of any learning theory to 
account for the acquisition of concepts at all. What learning theories can account for are the 
flexibility and variability of cognition. That is, hypothesis testing and confirmation allow 
the subject to fix his/her belief that a particular hypothesis is the correct one. 

This is not an easy book, and many other points are also made. The main arguments are 
haphazardly presented in the chapters cited above. Other points raised are doubts by the 
biologists about Piaget's anti-Darwinian views (Cellérier, Changeux, Danchin); Sperber's 
proposal that symbolic function is also a specific innate ability; Thom's disagreement with 
Piaget over the child's conception of space; Petitot's use of catastrophe theory to rebut 
Fodor and Chomsky’s analysis of induction; and Bischof's critique of Piaget’s interpretation 
of Lorenz. The editor heroically provided summaries of each chapter, but drastic pruning and 
reorganisation would have made that task considerably easier. 

As for the teams, Inhelder and Papert lined up with Piaget; Fodor, Monod, Premack 
and Sperber lined up with Chomsky. Papert and Toulmin were critical of both positions, 
others were critical of one view without endorsing the other. Changeux’ elegant biological 
contribution was a masterly exercise in caution. 

So is the book worth the effort? Yes. The chapters by Chomsky and Piaget are excel- 
lent sources for concise expositions of their views. Chapters 6 and 12 provide much food for 
thought about psychological investigations; examining one’s hidden assumptions is a sobering 
exercise. Chapter 6 and Part II also give those of us who have struggled with Fodor's The
Language of Thought a second chance. As for the debate itself, that surely must concern
anyone involved in the study of children's cognitive abilities. 

ROSEMARY STEVENSON.


SMITH, P. K., and CONNOLLY, K. J. (1981). The Ecology of Pre-school Behaviour.
London: Cambridge University Press, pp. 383, £25-00. 


The book is an account of psycho-ecological research carried out in a specifically 
planned pre-school playgroup, the research having been supported by the Social Science 
Research Council. The authors, psychologists educated in the traditional experimental 
approach to the study of behaviour, claim to have brought experimental rigour to ecological 
study and the claim appears to be justified within the limits of their field of investigation.
While care was taken to inform the parents of the children and to gain their consent in respect 
of the purpose of establishing the playgroup, i.e., to carry out research as well as to appoint 
suitably qualified adults for supervision, a well-defined experimental programme was 
immediately set into action. The variables of number, space, equipment, role of adult and 
ratio of staff to children were manipulated over a period of three years while the ages of
children in attendance were controlled together with a fair division of males and females. 
Qualities of behaviour were recorded continuously within a wide variety of simple rather than 
complex patterns in order to uphold objectivity and to establish low-inference status on the 
units. Adequate time for preparation and pilot projects was allowed and observer reliability 
well established. Time sampling procedures (well reviewed) were closely considered and those
judged most appropriate, i.e., modified focal-child samples with the data transcribed in a 
one-zero format, were finally used. 

On such a basis of meticulous planning the results bring a note of authenticity to the 
inferences about cause-effect relationships. The account is clearly written, the diagrams 
well and spaciously presented and the tables distributed through the text to the advantage of 
the reader. The five appendices give full details of the background of the children, category 
definitions, sample records, the total occurrences for the 89 behaviour units, and a copy of 
the instructions to the staff concerning the two playgroup regimes. The bibliography is 
extensive. All of this will account for the very high cost of the publication. 

Having made such detailed observations within a tight framework on one type of experi- 
ence for the pre-school child the authors look to the implications of their work and attempt 
to relate it to the overall scene. And it is here that judgment begins to be less precise. The 
authors are guilty of an over-expansive title, which should have been ‘The Ecology of a
Playgroup’. The cause-effect relationships established therein might well be of interest, and
indeed, direct use to other pre-school settings since the variables chosen for manipulation are 
basic, but an educationist reading the text must be conscious that it demonstrates a lack of 
appreciation of the differences of educational value and philosophy amongst pre-school 
settings. For example, the comments of the staff of this playgroup reveal the differences 
of educational outlook which exist along the continuum of interested, partially and fully 
qualified educators of young children. 

Ethological studies, if they are to preserve the Tinbergen flavour, surely must be under- 
taken with full appreciation of the most minute nuances of change within the context together 
with shades of feeling and behaviour of all participants of the scene. Rigour should spring 
from such attention to detail rather than from the imposition of a framework to the scene. 
Perhaps the authors, if they wished to generalise, would have been wiser to have enlarged 
their team to include an educationist so that the abiding principle of ‘studying behaviour
while leaving it alone’ might have been upheld. All psychologist-ethologists are agreed that
the meaning of behaviour can only be seen against a background of understanding of the 
situational context, and within the pre-school field the differing emphases related not only to 
material effects but also to educational philosophy and practice should be recognised. It 
may be that participatory research alone will provide the quality which allows for generalis- 
ation across the field. 

MARGERY C. COOPER.


WEDELL, K., and LAMBOURNE, R. (1980). Psychological Services for Children in
England and Wales. Division of Educational and Child Psychology Occasional 
Papers 4, 1 and 2. Leicester: British Psychological Society, pp. 84, £2-00. 


As this report stresses, there have been major changes over the past ten years or so in the 
context in which psychologists concerned with children have been working. The services 
employing psychologists (whose numbers increased substantially during the 1970s) have been 
reorganised, legislative changes have affected child care and education, and society's attitudes 
to professional help have undergone modification. Psychologists’ perceptions of their own 
roles and functions have also developed, and there has inevitably been much agonising over 
the direction in which the profession should go. It was timely, therefore, that the Division of 
Educational and Child Psychology of the British Psychological Society should attempt to 
obtain factual data relating to psychological services for children and to gather the opinions 
of psychologists about their current work and its future development. It was a good idea to 
attempt to present a picture of both educational psychologists and clinical psychologists 
working with children in the same report, though the sample of clinical psychologists involved 
(89) is much less representative than that of educational psychologists (295), and the report is 
much more successful in dealing with the latter group. It was also a good idea to organise
regional discussion groups, whose comments helped to add life to the responses to the
questionnaire. 

This report does not make exciting reading but it is lucidly written and readable, concise
yet informative. The results are presented in well-organised tables, and are discussed soundly 
and dispassionately. In such a complex survey it is difficult to select those findings which are 
particularly worthy of note, especially as such a diversity of patterns of activity exists among 
individuals. However, on the positive side, it is encouraging that there have been increased 
demands for the services of psychologists from a variety of sources and that psychologists are 
paying more attention to their preventative role. They are also, to a growing extent, moving 
into the child's natural settings rather than remaining clinic-bound, and they are seeking a 
closer partnership with parents, teachers and others involved in the care and education of 
children. On the more negative side, there are many sources of frustration for psychologists. 
The profession is a young and highly qualified one, but promotion prospects are not very 
hopeful. So much time has to be spent on assessment that relatively little attention can be 
given to intervention; work with young children often has to be given low priority in the face 
of demands for help in crises concerning older children and adolescents; and the growing 
pressures on psychological services have not led to an appropriate increase in staff. Clinical 
psychologists, while welcoming the opportunities they have for collaboration with other 
professionals, are often frustrated by the constraints imposed on them in working in a medical
setting. 

The report confines itself to the analysis of findings from the enquiry, and a discussion 
of its wider implications has been left to a BPS working party on psychological services. 
Nevertheless, it does point to a number of issues which need further consideration. For 
example, opinions are divided about the desirability of maintaining a distinction between 
‘educational’ and ‘clinical’ psychologists as far as work with children is concerned. The
majority of both educational and clinical psychologists favour some change in the present 
pattern of professional organisation, but there is little agreement on possible alternative 
patterns. There seems to be general agreement that professional training should be broadened 
and extended to make it appropriate for a wide range of settings, though educational psycho- 
logists are not unanimous about the need for teacher training and teaching experience as part 
of their qualifications. 

These are difficult times for the development of any profession working within the 
educational, health or social services. However, it is important that psychologists should be 
concerned not only with the effects of financial cuts on their work but also with the broader 
issues relating to professional growth. This report should help to stimulate their thinking. 

MAURICE CHAZAN. 


WHITE, R. (1980). Absent with Cause. London: Routledge and Kegan Paul, pp.
xi + 285, £5-95.


This book describes the work of the Bayswater Centre in Bristol which “ provides full- 
time education for young people who have stopped attending comprehensive schools, and for 
whom the alternative may well be home tuition or residential provision in community 
homes or assessment centres”. It places the Bayswater Centre in a context of alternative
educational provision in Britain generally, and describes a visit to the Danish After-School 
at Tevind which obviously provided an important touchstone for the Bayswater Centre 
staff's feelings about their own progress. The bulk of the book is given over to a description 
of the Bayswater Centre's activities during a documented year with ‘a whole intake of
youngsters’. On this level the book is an unqualified success, providing a lucid account of
the work of sensitive and talented teachers with children given up for lost in mainstream 
education. Choices, for example the one to prepare lunch on the Centre's Baby Belling in 
preference to the other options, are described in detail and will certainly be of interest to other 
specialist centres. The decisions made may differ, but the account of the reasoning behind 
them will be of value. 

At another level, the book raises issues rather than presents one option for a workable 
solution. It is clear that Roger White regards the Bayswater Centre as at least as much a 
model for educational provision generally as a specialist venture. He writes:

“What is important is what lessons (special centres) hold for general educational
provision. The question to ask now is: what can we learn from the operation of those
special units that might improve the provision within schools” (p. 231).


Yet, as White admits, “the processes of operation of many special units challenge some of
the basic assumptions under which schools operate” (p. 231). It is at least arguable that,
believing what White does, the effect of the special units may be to remove the yeast from the 
possible fermentation of change within schools, to which his reply would no doubt be that 
only outside the conventional school system does there exist the necessary scope for experi- 
mentation. 

A second important issue, to consideration of which the book leads, is that of the relationship
between school and work. One of the Bayswater Centre's emphases was on the provision
of work experience, yet it is clear from the children in the book that the relationship is
pretty tenuous. For example, the Centre's biggest headache, ‘Alec’, upon leaving found and
kept a job throughout the period with which the book deals. The reviewer would like to read 
Roger White’s views on the justification for attendance at the Centre for people like Alec. 
There are many parallels between the Bayswater Centre and provisions of the criminal 
justice system like the day training centres established under the 1972 Criminal Justice Act. 
These centres are justifiable as holding stations for petty persistent offenders to avoid a worse 
fate befalling them. Places like the Bayswater Centre are at least justifiable on such modest
grounds.

KEN PEASE.


Wilson, M., and Evans, M. (1980). Education of Disturbed Pupils. Schools Council
Working Paper 65. London: Methuen Educational, pp. 287, p. £7.00.

Dawson, R. L. (1980). Special Provision for Disturbed Pupils: A Survey. Basingstoke:
Macmillan Education, pp. x+110, c. £6.95.


Both of these books are reports on the findings of a Schools Council Project. The first 
book is the report of the directors of the project and the second summarises the statistical 
findings gleaned from questionnaires used in the project. 

Education of Disturbed Pupils describes an attempt to study, report and comment on 
successful ways of meeting the needs of disturbed pupils in educational settings. The project 
was not limited to those children in schools for the maladjusted. Some special units, classes 
and ordinary schools dealing with disturbed children were also selected. It can immediately 
be seen what a mammoth task was set for the project team. 

Methodologically, the problems of such a wide-ranging investigation seem insuperable. 
There is no adequate method of defining ‘disturbance’; the heterogeneity of the population
defies taxonomy; the variety of possible approaches to treatment and the difficulties of 
identifying and measuring the myriad variables which might affect outcome present labyrin- 
thine problems. It is hardly surprising, therefore, that while the project team were well aware 
of these problems they admit an inability to contribute greatly to their solution. Instead, 
the team began with an operational definition of ‘disturbance’, made their enquiries by
questionnaire and follow-up visits to find out if methods recorded in the questionnaire were 
carried out in practice and to judge how well a school was working as a community. In the 
absence, however, of any empirically established criteria of what might comprise good prac- 
tice and due to limitations of time and financial resources, the team were put in the tautologi- 
cal position of selecting schools to visit in order to determine what comprises good practice 
on a recommendation from experienced people that good practice already existed in those 
schools. The authors state that they are ‘reasonably confident’ that those with experience of
disturbed children would be able to judge the effectiveness of opportunities offered to pupils.
It comes as no surprise, therefore, when the following conclusion is encountered: “We
have always felt that well-run caring schools make a distinct difference to the happiness and 
success of children attending them and that they create a climate conducive to good mental 
health. Our survey has confirmed us in this belief.” 

It would be unfair, however, to dwell at length upon such criticisms. What the book 
does, and does well, is to give an account of current (not necessarily good) practice. It is rich 
in ideas about ways in which the therapeutic and educational objectives might be combined; 
it contains a wealth of suggestions for a curriculum for ‘disturbed’ children, including
teaching methods and materials; it draws attention to the need to involve outside agencies
and it emphasises the importance of personal relationships in teaching such children. The
detailed accounts of seven schools' distinctive and differing approaches, which include some
of the pupils’ perceptions of their schools, derived from a questionnaire, are informative and
interesting. As a source of ideas, a basis for discussion and a reminder to teachers of their
importance as caring individuals, it is an effective book.

Any disappointment experienced when reading of the many unsolved problems described
in the first book is assuaged by the balanced and perceptive presentation of the survey. Responses
to a questionnaire sent to all the special schools for the maladjusted in England and
Wales are analysed. Some reference is also made to two other questionnaires sent to special
classes/units and ordinary schools. 

The author's stated intention is to describe current practice and to identify knowledge and 
opinions about the treatment of the maladjusted perceived by staff in the special schools—a 
far less ambitious and more easily achievable aim than that of the first book. 

The questionnaire sent to special schools takes up a staggering 19 pages of the appendices 
of the book and contains 47 different areas related to schools, staff, pupils, supporting agencies, 
care and control of pupils, educational programme and teaching methods, medical and 
psychological treatment, work with families, methods of recording and assessment and factors
impeding the work of schools. The project team’s belief that staff of maladjusted schools are
highly motivated was confirmed by the 66 per cent return from the 178 schools circulated.
The layout of the questionnaire is clear and uncluttered but one or two items arouse
some concern. Classification of pupils’ predominant patterns of behaviour is adapted from 
the Isle of Wight studies categories with a couple of omissions and additions. One added 
category, ‘Neurological Abnormalities’, includes pupils with clinical evidence of brain
injury. It is difficult to see how this fits into an item demanding classification of behaviour. 
Moreover, respondents are required to judge the predominant feature and not to list any 
child twice. There are two obvious problems here. Is predominance to be judged on the 
basis of frequency, duration or intensity of a behaviour? Secondly, a child cannot be listed, 
for example, as both developmentally disordered and neurotic. This in itself may not be too 
important, but later analysis relating treatment to disorders can take no account of the effect 
of secondary or multiple conditions. 

The chapter on treatment programmes used by the schools, which includes four
different analyses of the perceived effectiveness of such treatments, makes fascinating reading.
It is interesting that the quality of relationships and improvement of self-image are more
popular than systematic behavioural or psychodynamic approaches. In general, the former
are perceived as being effective with conduct-disordered pupils, neurotic pupils, and pupils having a
mixture of both disorders. Amongst other useful information is detail of outcomes and
evidence of success of treatment in terms of, for example, leaving school under age to return to 
ordinary or other school and going on to employment. A final overview gives an excellent 
summary of the findings relating to the special classes and units as well as the schools for 
maladjusted children. 

The statistical treatment of the data is impeccable. The author has avoided the use of 
over-complicated calculations and presents the analysis in a way which will undoubtedly help 
fulfil his hope that the survey will act as a stimulus for much needed further research. 

IAN BERRY. 

Acquiring Language in a Conversational Context presents a new perspective on a subject that has
provoked considerable interest among psychologists and linguists concerned with language
development—the extent to which mother-child conversation assists first language learning. Unlike
previous studies which have concentrated on specific aspects, this book characterizes mother-child
conversation in total before discussing its relevance to language learning. Three styles were discovered,
‘excursive’, ‘recursive’, and ‘discursive’, and then evaluated for the degree to which they
could inform children about their native language and (an aspect neglected by other studies)

Arthur Jensen is probably the world’s best-known
authority in mental testing. His many
previous publications have been thoroughly
technical in nature, even when arousing
immense controversy. This book, however,
supposes no background in specialized
terminologies; it allows the ordinary reader
to view the subject very much in the same
perspective as it is viewed by specialists in
psychometrics and differential psychology.

Teaching handicapped children confronts us
with the challenge of having to plan,
deliberately and systematically, how to teach
a child to look, listen, move and speak. This
practical book provides a basic understanding
of the effects of a wide range of handicaps
on the development of children at all ages.

Hardback 0416732607 £7.50
Paperback 0416732704 £3.95 

EFFECTIVE LEARNING


By ALEX MAIN 


There are many books on study methods, but they are not as widely
consulted as might be supposed. Students tend to turn to their teachers 
for help with study difficulties. Yet, many teachers, tutors and lecturers 
feel limited in the extent of the advice they can give. So far there has 
been no book available which deals with this aspect of a teacher's 
work. 


Now, teachers in secondary, further and higher education can turn 
to Alex Main's book, Encouraging Effective Learning, for help in this 
difficult area.

Alex Main has developed a successful study counselling service at 
the University of Strathclyde, and his work has attracted wide interest 
in this country and abroad. His book describes how other teachers can
use his approach in advising and counselling students who have study 
difficulties. 

This book describes techniques and activities which require no 
specialised training. It creates an understanding of the styles of learning 
which different students adopt, and the difficulties they can face: and 
it shows how the sympathetic teacher can do much to relieve their 
anxiety and improve their methods. 

The book also contains a guide to books, pamphlets and audio-visual
materials which provide additional help to students and useful
resources for the teacher. 

MARTIN, FOX & MURRAY

Since 1971 Scotland has had a unique system of juvenile justice, in 
which a central part is played by panels of lay volunteers. In what 
was perhaps the most detailed study of any juvenile justice system 
ever carried out, the authors examined the operation of the chil- 
dren's hearings system throughout Scotland. They gave particular 
attention to the factors associated with decision-making by the 
panels themselves and by the intake officials (reporters to chil- 
dren's panels) who make the initial decision as to whether chil- 
dren referred to them by the police or other agencies are "in need 
of compulsory measures of care". These enquiries involved both a 
detailed analysis of records and systematic observation and 
recording of the interaction of the hearings themselves. Resulting 
data have been examined to throw light on the quality of practice, 
including adherence to procedural requirements and the partici- 
pation of children and parents in the proceedings. Samples of 
children and parents were personally interviewed to gain under- 
standing of their sense of the fairness of the hearings, their feelings 
of personal involvement in the process, and of issues of stigma 
and labelling. Large scale questionnaire studies made it possible 
to identify the operational philosophies of panel members and of 
the social workers who serviced the hearings. Concluding chapters 
review the implications of this project for the theory and practice 
of juvenile justice in Britain and in the United States. 